Abstract

We show that Laplacian and symmetric diagonally dominant (SDD) matrices can be well
approximated by linear-sized sparse Cholesky factorizations. Specifically, $n \times n$ matrices
of these types have constant-factor approximations of the form $LL^T$, where $L$ is a
lower-triangular matrix with $O(n)$ non-zero entries. This factorization allows us to solve linear
systems in such matrices in $O(n)$ work and $O(\log n \log^2 \log n)$ depth.

We also present nearly linear time algorithms that construct solvers that are almost this
efficient. In doing so, we give the first nearly-linear work routine for constructing spectral
vertex sparsifiers, that is, spectral approximations of Schur complements of Laplacian
matrices.


1 Introduction 


There have been incredible advances in the design of algorithms for solving systems of linear
equations in Laplacian and symmetric, diagonally dominant (SDD) matrices. Cohen et al.
[CKM+14] have recently designed algorithms that find $\epsilon$-approximate solutions to such systems
of equations in time $O(m \log^{1/2} n \log \epsilon^{-1})$, where $n$ is the dimension of the matrix and $m$ is
its number of nonzero entries. Peng and Spielman [PS14] recently discovered the first parallel
algorithms that require only poly-logarithmic time and nearly-linear work. In this paper, we
prove that for every such matrix there is an operator that approximately solves equations in this
matrix and that can be evaluated in linear work and depth $O(\log n (\log\log n)^2)$. These operators
are analogous to the LU decompositions produced by Gaussian elimination: they take longer to
compute than to apply.

We present two fast parallel algorithms for finding solvers that are almost as fast. One runs
in nearly linear time and polylogarithmic depth (Theorem 9.2). The algorithm presented in
Theorem 9.8 has preprocessing depth $n^{o(1)}$, but is more efficient in terms of work and produces
a solver whose work and depth are within a logarithmic factor of the best one we can show
exists.


*Supported in part by NSF awards 0843915 and 1111109. Part of this work was done while visiting the Simons
Institute for the Theory of Computing, UC Berkeley.

†Supported by AFOSR Award FA9550-12-1-0175, NSF grant CCF-1111257, a Simons Investigator Award, and
a MacArthur Fellowship.
A matrix $A$ is diagonally dominant if each of its diagonal entries is at least the sum of the
absolute values of the off-diagonal entries in its row. The most famous symmetric, diagonally
dominant matrices are the Laplacian matrices of graphs: those with non-positive off-diagonal
entries such that every diagonal entry is exactly equal to the sum of the absolute values of the
off-diagonal entries in its row. Laplacian and SDD matrices arise in many applications, including
the solution of optimization problems such as maximum flow [CKM+11, KMP12, LRS13, Mad13], minimum
cost flow [DS08, LS13], semi-supervised learning [ZGL03], and the solution of elliptic PDEs
[BHV08].
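As a small illustration of these definitions, the following sketch builds the Laplacian of a weighted graph and checks diagonal dominance. The helper names `laplacian` and `is_sdd` and the example graph are ours and do not come from the paper.

```python
import numpy as np

def laplacian(n, weighted_edges):
    """Laplacian of a weighted undirected graph given as (u, v, w) triples."""
    L = np.zeros((n, n))
    for u, v, w in weighted_edges:
        L[u, v] -= w
        L[v, u] -= w
        L[u, u] += w
        L[v, v] += w
    return L

def is_sdd(A, tol=1e-12):
    """True if A is symmetric and each diagonal entry is at least the sum of
    the absolute values of the off-diagonal entries in its row."""
    off = np.abs(A).sum(axis=1) - np.abs(np.diag(A))
    return bool(np.allclose(A, A.T) and np.all(np.diag(A) >= off - tol))

# A weighted 4-cycle (hypothetical example data).
L = laplacian(4, [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 0.5), (3, 0, 1.5)])
print(is_sdd(L))                        # Laplacians are SDD
print(np.allclose(L.sum(axis=1), 0))    # each diagonal equals its off-diagonal sum
```

Adding a nonnegative diagonal to `L` keeps it SDD, while subtracting from the diagonal breaks the condition.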

Building on the work of Vaidya [Vai90], Spielman and Teng [ST14] discovered that through
the use of two constructions in graph theory, sparsifiers and low stretch spanning trees, one
could design algorithms for solving such linear equations that run in nearly-linear time. Kelner
et al. [KOSZ13] construct an elementary algorithm for solving SDD systems in nearly linear
time that only makes use of low stretch spanning trees. Conversely, Peng and Spielman [PS14]
design an algorithm that only uses sparsifiers. The present paper builds on their approach.

The parallel algorithm of Peng and Spielman [PS14] approximates the inverse of a matrix
by the sum and product of a small number of sparse matrices. The main bottleneck in their
algorithm is that all of the matrices it produces have the same dimension, and that the number
of these matrices depends on the condition number of the system to be solved. This leads to
each matrix having an average number of nonzero entries per column that is proportional to the
square of the logarithm of the condition number, leading to work $O((m + n \log^3 n) \log \epsilon^{-1})$.

Our result improves on the construction of Peng and Spielman [PS14] in a number of ways.
First, the depth and work of our new algorithms are independent of the condition number of the
matrix. Second, the matrices in the product that approximates the inverse are of geometrically
decreasing sizes. This leads to much faster algorithms. That said, our efficient algorithms for
constructing solvers and spectral vertex sparsifiers critically rely on their work.

We introduce sparsified Cholesky factorization in Section 5, where we prove that the
inverse of every SDD matrix $A$ can be approximated by an operator that can be evaluated in
linear work and depth $O(\log^2 n \log\log n)$. By using this operator as a preconditioner, or by
applying iterative refinement, this leads to a solver that produces $\epsilon$-approximate solutions to
systems in $A$ in work $O(m \log \epsilon^{-1})$ and depth $O(\log^2 n \log\log n \log \epsilon^{-1})$, where $m$ is the number
of nonzeros in $A$. We begin by eliminating a block consisting of a constant fraction of the
vertices. The elimination of these vertices adds edges to the subgraph induced on the remaining
vertices. We use the work of [BSS12] to sparsify the modified subgraph (Figure 2, Lemma 5.8,
and Theorem 5.10). The choice of which vertices we eliminate is important. We use a subset of
vertices whose degrees in their induced subgraph are substantially smaller than in the original
graph (see Definition 5.1 and Lemma 5.2).

In Section 6 we show how to convert this solver into a sparse approximate inverse. That is,
we show that $A$ can be approximated by a product of the form $U^T D U$ where $U$ is an
upper-triangular matrix with $O(n)$ nonzero entries and $D$ is diagonal. While we can construct this $U$
and $D$ in polynomial time, we do not yet have a nearly linear time or low depth efficient parallel
algorithm that does so.

We obtain our best existence result in Section 7 by reducing the depth of the parallel solvers
by a logarithmic factor. The reduction comes from observing that the construction of Section
6 would have the desired depth if every vertex in $A$ and in the smaller graphs produced had
bounded degree. While we can use sparsification to approximate an arbitrary graph by a sparse
one, the sparse one need not have bounded degree. We overcome this problem by proving that
the Laplacian of every graph can be approximated by a Schur complement of the Laplacian of
a larger graph of bounded degree (Theorem 7.2).

We then turn to the problem of computing our solvers efficiently in parallel. The first
obstacle is that we must quickly compute an approximation of a Schur complement of a set of
vertices without actually constructing the Schur complement, as it could be too large. This
is the problem we call Spectral Vertex Sparsification. It is analogous to the problem of vertex
sparsification for cut and combinatorial flow problems [LM10, Moi13]: given a subset of the
vertices we must compute a graph on those vertices that allows us to compute approximations
of electrical flows in the original graph between vertices in that subset. In contrast with cut
and combinatorial flow problems, there is a graph that allows for this computation exactly on
the subset of vertices, and it is the Schur complement in the graph Laplacian. In Section 8
we build on the techniques of [PS14] to give an efficient algorithm for spectrally approximating
Schur complements.

The other obstacle is that we need to compute sparsifications of graphs efficiently in parallel.
We examine two ways of doing this in Section 9. The first, examined in Section 9.1, is to use a
black-box parallel algorithm for graph sparsification, such as that of Koutis [Kou14]. This gives
us our algorithm of best total depth. The second, examined in Section 9.2, employs a recursive
scheme in which we solve smaller linear systems to compute probabilities with which we sample
the edges, as in [SS11]. Following [CLM+14], these smaller linear systems are obtained by crudely
sub-sampling the original graph. The resulting algorithm runs in depth $n^{o(1)}$, but produces a
faster solver. We expect that further advances in graph sparsification such as [AZLO15] will
result in even better algorithms.

2 Some Related Work 

Gaussian elimination solves systems of equations in a matrix A by computing lower and upper 
triangular matrices L and U so that A = LU. Equations in A may then be solved by solving 
equations in L and U, which takes time proportional to the number of nonzero entries in those 
matrices. This becomes slow if $L$ or $U$ has many nonzero entries, which is often the case.
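The effect of fill-in can be seen on a very small example. The sketch below, which is our own illustration and not taken from the paper, factors the matrix of a star graph (Laplacian plus identity) under two elimination orders and counts the nonzero entries of the Cholesky factor. The helper `cholesky_nnz` and the tolerance are assumptions made for the demonstration.

```python
import numpy as np

def cholesky_nnz(A, tol=1e-12):
    """Number of entries of the lower Cholesky factor of A exceeding tol in magnitude."""
    return int(np.sum(np.abs(np.linalg.cholesky(A)) > tol))

n = 8
# Star graph with hub 0: Laplacian plus identity, so the matrix is positive definite.
A = 2.0 * np.eye(n)
A[0, 0] = n
A[0, 1:] = A[1:, 0] = -1.0

perm = np.r_[1:n, 0]                     # same matrix with the hub ordered last
A_hub_last = A[np.ix_(perm, perm)]

print(cholesky_nnz(A))                   # hub eliminated first: the factor fills in
print(cholesky_nnz(A_hub_last))          # hub eliminated last: the factor stays sparse
```

Eliminating the hub first connects all of the leaves to each other, so the factor is a dense triangle, whereas eliminating it last creates no new entries.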

Cholesky factorization is the natural symmetrization of this process: it writes symmetric
matrices $A$ as a product $LL^T$. Incomplete Cholesky factorizations [MV77] instead approximate
$A$ by a product of sparse matrices $LL^T$ by strategically dropping some entries in the computation
of Cholesky factors. One can then use these approximations as preconditioners to compute highly
accurate solutions to systems in $A$. While this is a commonly used heuristic, there have been
few general theoretical analyses of the performance of the resulting algorithms. Interestingly,
Meijerink and van der Vorst [MV77] analyze the performance of this algorithm on SDD matrices
whose underlying graph is a regular grid.

SDD linear systems have been extensively studied in scientific computing as they arise when
solving elliptic partial differential equations. Multigrid methods have proved very effective at
solving the resulting systems. Fedorenko [Fed64] gave the first multigrid method for SDD systems
on regular square grids and proved that it is a nearly-linear time algorithm. Multigrid methods
have since been used to solve many types of linear systems [Bra77, Hac85], and have been
shown to solve special systems in linear work and logarithmic depth [Nic78, Hac82] under some
smoothness assumptions. Recently, Napov and Notay [NN12] gave the first algebraic multigrid
method with a guaranteed convergence rate. However, to the best of our knowledge, a worst-case
nearly-linear work bound has not been proved for any of these algorithms.

Our algorithm is motivated both by multigrid methods and incomplete Cholesky factorizations.
Both exploit the fact that elimination operations in SDD matrices result in SDD matrices.
That is, Schur complements of SDD matrices result in SDD matrices with fewer vertices. However,
where multigrid methods eliminate a large fraction of vertices at each level, our algorithms
eliminate a small but constant fraction. The main novelty of our approach is that we sparsify
the resulting Schur complement. A heuristic approach to doing this was recently studied by
Krishnan, Fattal, and Szeliski [KFS13].

3 Background 

We will show that a diagonally dominant matrix $A$ can be well-approximated by a product $U^T D U$
where $U$ is upper-triangular and sparse and $D$ is diagonal. By solving linear equations in each
of these matrices, we can quickly solve a system of linear equations in $A$. We now review the
notion of approximation that we require along with some of its standard properties.

For symmetric matrices $A$ and $B$, we write $A \succcurlyeq B$ if $A - B$ is positive semidefinite. The
ordering given by $\succcurlyeq$ is called the “Loewner partial order”.

Fact 3.1. For $A$ and $B$ positive definite, $A \succcurlyeq B$ if and only if $B^{-1} \succcurlyeq A^{-1}$.

Fact 3.2. If $A \succcurlyeq B$ and $C$ is any matrix of compatible dimension, then $C A C^T \succcurlyeq C B C^T$.

We say that $A$ is an $\epsilon$-approximation of $B$, written $A \approx_{\epsilon} B$, if

$$e^{\epsilon} B \succcurlyeq A \succcurlyeq e^{-\epsilon} B.$$

Observe that this relation is symmetric. Simple arithmetic yields the following fact about 
compositions of approximations. 

Fact 3.3. If $A \approx_{\epsilon} B$ and $B \approx_{\delta} C$, then $A \approx_{\epsilon + \delta} C$.
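For positive definite matrices the approximation relation can be measured numerically. The following sketch computes the smallest $\epsilon$ for which one matrix $\epsilon$-approximates another and checks symmetry and the composition rule on random data. The function name `approx_factor` and the test matrices are ours; the restriction to positive definite inputs is an assumption of the sketch.

```python
import numpy as np

def approx_factor(A, B):
    """Smallest eps >= 0 with exp(-eps) B <= A <= exp(eps) B in the Loewner order,
    for symmetric positive definite A and B."""
    R_inv = np.linalg.inv(np.linalg.cholesky(B))       # B = R R^T
    lam = np.linalg.eigvalsh(R_inv @ A @ R_inv.T)      # same spectrum as B^{-1} A
    return float(np.max(np.abs(np.log(lam))))

rng = np.random.default_rng(0)
G = rng.standard_normal((5, 5))
A = G @ G.T + 5 * np.eye(5)
B = A + np.diag(rng.uniform(0.0, 1.0, 5))
C = 1.2 * B

print(approx_factor(A, B), approx_factor(B, A))        # the relation is symmetric
# Composition of approximations (Fact 3.3):
print(approx_factor(A, C) <= approx_factor(A, B) + approx_factor(B, C) + 1e-12)
```

The eigenvalues of $B^{-1}A$ lie in $[e^{-\epsilon}, e^{\epsilon}]$ exactly when $A \approx_{\epsilon} B$, which is why the largest absolute log-eigenvalue is the quantity returned.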

We say that $x$ is an $\epsilon$-approximate solution to the system $Ax = b$ if

$$\|x - A^{-1} b\|_A \le \epsilon \|x\|_A,$$

where

$$\|x\|_A = (x^T A x)^{1/2}.$$

This is the notion of approximate solution typically used when analyzing preconditioned linear 
system solvers, and it is the notion assumed in the works we reference that use these solvers as 
subroutines. 

Fact 3.4. If $\epsilon < 1/2$, $A \approx_{\epsilon} B$ and $Bx = b$, then $x$ is a $2\sqrt{\epsilon}$-approximate solution to $Ax = b$.

So, if one can find a matrix $B$ that is a good approximation of $A$ and such that one can quickly
solve linear equations in $B$, then one can quickly compute approximate solutions to systems of
linear equations in $A$. Using methods such as iterative refinement, one can use multiple solves
in $B$ and multiplies by $A$ to obtain arbitrarily good approximations. For example, if $B$ is a
constant approximation of $A$, then for every $\epsilon < 1$, one can obtain an $\epsilon$-approximate solution
of a linear system in $A$ by performing $O(\log(\epsilon^{-1}))$ solves in $B$ and multiplies by $A$ (see, for
example, [PS14, Lemma 4.4]).
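One simple form of iterative refinement is the preconditioned Richardson iteration sketched below. This is an illustration under our own assumptions, not the specific procedure of [PS14]: we assume the preconditioner $B$ satisfies $A \preccurlyeq B \preccurlyeq 1.6 A$, so that each step shrinks the error in the $A$-norm by a constant factor. The function name `refine` and the test matrices are hypothetical.

```python
import numpy as np

def refine(A, solve_B, b, iters):
    """Preconditioned Richardson iteration: x <- x + B^{-1} (b - A x)."""
    x = np.zeros_like(b)
    for _ in range(iters):
        x = x + solve_B(b - A @ x)     # one multiply by A and one solve in B
    return x

n = 20
A = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
A[0, 0] = A[-1, -1] = 2.0                     # path Laplacian plus identity
B = A + 0.2 * np.diag(np.diag(A))             # here A <= B <= 1.6 A in the Loewner order
b = np.arange(1.0, n + 1)

x = refine(A, lambda r: np.linalg.solve(B, r), b, 40)
x_star = np.linalg.solve(A, b)
err = x - x_star
print(np.sqrt(err @ A @ err) / np.sqrt(x_star @ A @ x_star))   # relative error in the A-norm
```

Because the error contracts geometrically, the number of iterations needed for accuracy $\epsilon$ grows like $\log(\epsilon^{-1})$, matching the count quoted above.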

It is known that one can reduce the problem of solving systems of equations in SDD matrices
to either the special case of Laplacian matrices or SDDM matrices, the family of SDD matrices
that are nonsingular and have non-positive off-diagonal entries (see, e.g., [ST14, CKM+14]). We
will usually consider SDDM matrices. Every SDDM matrix $A$ can be uniquely written as a sum
$L + X$ where $L$ is a Laplacian matrix and $X$ is a nonnegative diagonal matrix.

The main properties of SDDM matrices that we exploit are that they are closed under Schur
complements and that they can be sparsified. The strongest known sparsifications come from the
main result of [BSS12], which implies the following.

Theorem 3.5. For every $n$-dimensional SDDM matrix $A$ and every $\epsilon \le 1$, there is a SDDM
matrix $B$ having at most $10n/\epsilon^2$ nonzero entries that is an $\epsilon$-approximation of $A$. In particular,
the number of non-zero entries in $B$ above the diagonal is at most $4.1 n/\epsilon^2$.

While the matrix $B$ guaranteed to exist by this theorem may be found in polynomial time,
this is not fast enough for the algorithms we desire. So, we only use Theorem 3.5 to prove
existence results. We later show how to replace it with faster algorithms, at some expense in
the quality of the sparsifiers we produce.


4 Block Cholesky Factorization 


Our algorithm uses block-Cholesky factorization to eliminate a block of vertices all at once. We 
now review how block-Cholesky factorization works. 

To begin, we remind the reader that Cholesky factorization is the natural way of performing
Gaussian elimination on a symmetric matrix: by performing eliminations on rows and columns
simultaneously, one preserves the symmetry of the matrix. The result of Cholesky factorization
is a representation of a matrix $M$ in the form $U^T U$, where $U$ is an upper-triangular matrix.
We remark that this is usually written as $LL^T$ where $L$ is lower-triangular. We have chosen to
write it in terms of upper-triangular matrices so as to avoid confusion with the use of the letter
$L$ for Laplacian matrices.

To produce matrices $U$ with 1s on their diagonals, and to avoid the computation of square
roots, one often instead forms a factorization of the form $U^T D U$, where $D$ is a diagonal
matrix. Block-Cholesky factorization forms a factorization of this form, but with $D$ being a
block-diagonal matrix.

To begin, we must choose a set of rows to be eliminated. We will eliminate the same set
of columns. For consistency with the notation used in the description of multigrid algorithms,
we will let $F$ (for finer) be the set of rows to be eliminated. We then let $C$ (for coarse) be
the remaining set of rows. In contrast with multigrid methods, we will have $|F| \le |C|$. By
re-arranging rows and columns, we can write $M$ in block form:

$$M = \begin{bmatrix} M_{FF} & M_{FC} \\ M_{CF} & M_{CC} \end{bmatrix}.$$
Elimination of the rows and columns in $F$ corresponds to writing

$$M = \begin{bmatrix} I & 0 \\ M_{CF} M_{FF}^{-1} & I \end{bmatrix}
\begin{bmatrix} M_{FF} & 0 \\ 0 & M_{CC} - M_{CF} M_{FF}^{-1} M_{FC} \end{bmatrix}
\begin{bmatrix} I & M_{FF}^{-1} M_{FC} \\ 0 & I \end{bmatrix}. \qquad (1)$$

Note that the left and right matrices are lower and upper triangular. The matrix in the
lower-right block of the middle matrix is the Schur complement of $F$ in $M$. We will refer to it often
by the notation

$$\mathrm{Sc}(M, F) \stackrel{\mathrm{def}}{=} M_{CC} - M_{CF} M_{FF}^{-1} M_{FC}.$$

We remark that one can solve a linear system in $\mathrm{Sc}(M, F)$ by solving a system in $M$: one just
needs to put zeros in the coordinates corresponding to $F$ in the right-hand-side vector.
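The remark about solving in the Schur complement through $M$ can be checked numerically. The sketch below is a dense illustration with hypothetical names (`schur_complement`, the random test matrix, and the choice of $F$ are ours): it pads the right-hand side with zeros on $F$, solves in $M$, and compares the $C$ coordinates with a direct solve in the Schur complement.

```python
import numpy as np

def schur_complement(M, F):
    """Return Sc(M, F) = M_CC - M_CF M_FF^{-1} M_FC and the complement C of F."""
    C = np.setdiff1d(np.arange(M.shape[0]), F)
    MFF, MFC = M[np.ix_(F, F)], M[np.ix_(F, C)]
    MCF, MCC = M[np.ix_(C, F)], M[np.ix_(C, C)]
    return MCC - MCF @ np.linalg.solve(MFF, MFC), C

rng = np.random.default_rng(0)
n, F = 6, np.array([0, 2])
W = np.triu(rng.uniform(0.1, 1.0, (n, n)), 1)
W = W + W.T
M = np.diag(W.sum(axis=1) + 0.5) - W      # Laplacian plus positive diagonal (SDDM)

S, C = schur_complement(M, F)
bC = rng.standard_normal(len(C))

rhs = np.zeros(n)
rhs[C] = bC                               # zeros in the coordinates of F
xC_via_M = np.linalg.solve(M, rhs)[C]     # solve in M, keep the C coordinates
xC_direct = np.linalg.solve(S, bC)        # solve in the Schur complement directly
print(np.allclose(xC_via_M, xC_direct))
```

This works because the $CC$ block of $M^{-1}$ is exactly the inverse of the Schur complement.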

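Recall that

$$\begin{bmatrix} I & 0 \\ M_{CF} M_{FF}^{-1} & I \end{bmatrix}^{-1}
= \begin{bmatrix} I & 0 \\ -M_{CF} M_{FF}^{-1} & I \end{bmatrix}. \qquad (2)$$

So, if we can quickly multiply by this last matrix, and if we can quickly solve linear systems in
$M_{FF}$ and in the Schur complement, then we can quickly solve systems in $M$. Algebraically, we
exploit the following identity:

Fact 4.1.

$$M^{-1} = \begin{bmatrix} I & -M_{FF}^{-1} M_{FC} \\ 0 & I \end{bmatrix}
\begin{bmatrix} M_{FF}^{-1} & 0 \\ 0 & \mathrm{Sc}(M, F)^{-1} \end{bmatrix}
\begin{bmatrix} I & 0 \\ -M_{CF} M_{FF}^{-1} & I \end{bmatrix}. \qquad (3)$$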
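Identity (3) can be verified directly on a small dense matrix. The following sketch assembles the three factors and compares their product with the inverse computed by NumPy. The function name `block_inverse`, the test matrix, and the particular split into $F$ and $C$ are assumptions made for the illustration.

```python
import numpy as np

def block_inverse(M, F, C):
    """Assemble M^{-1} from M_FF^{-1} and Sc(M, F)^{-1} as in identity (3).
    Rows and columns of the result are in the order (F, C)."""
    MFF, MFC = M[np.ix_(F, F)], M[np.ix_(F, C)]
    MCF, MCC = M[np.ix_(C, F)], M[np.ix_(C, C)]
    MFF_inv = np.linalg.inv(MFF)
    S_inv = np.linalg.inv(MCC - MCF @ MFF_inv @ MFC)
    nF, nC = len(F), len(C)
    upper = np.block([[np.eye(nF), -MFF_inv @ MFC],
                      [np.zeros((nC, nF)), np.eye(nC)]])
    middle = np.block([[MFF_inv, np.zeros((nF, nC))],
                       [np.zeros((nC, nF)), S_inv]])
    lower = np.block([[np.eye(nF), np.zeros((nF, nC))],
                      [-MCF @ MFF_inv, np.eye(nC)]])
    return upper @ middle @ lower

rng = np.random.default_rng(1)
n = 5
W = np.triu(rng.uniform(0.1, 1.0, (n, n)), 1)
W = W + W.T
M = np.diag(W.sum(axis=1) + 1.0) - W
F, C = np.array([1, 3]), np.array([0, 2, 4])
perm = np.concatenate([F, C])
print(np.allclose(block_inverse(M, F, C), np.linalg.inv(M[np.ix_(perm, perm)])))
```

In the algorithms of the paper the explicit inverses are of course never formed; the point of the sketch is only that the factorization reproduces $M^{-1}$.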


Our algorithms depend upon the following important property of Schur complements of 
SDDM matrices. 


Fact 4.2. If M is a SDDM matrix and F is a subset of its columns, the Schur complement 
Sc(M,F) is also a SDDM matrix. 
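A quick numerical sanity check of this closure property, on one random instance, might look as follows. The predicate `is_sddm`, its tolerance, and the test data are ours; passing on an example is of course not a proof of the fact.

```python
import numpy as np

def is_sddm(A, tol=1e-10):
    """Symmetric, non-positive off-diagonals, diagonally dominant, nonsingular."""
    off = A - np.diag(np.diag(A))
    dominant = np.all(np.diag(A) >= np.abs(off).sum(axis=1) - tol)
    full_rank = np.linalg.matrix_rank(A) == A.shape[0]
    return bool(np.allclose(A, A.T) and np.all(off <= tol) and dominant and full_rank)

rng = np.random.default_rng(2)
n = 8
W = np.triu(rng.uniform(0.0, 1.0, (n, n)), 1)
W = W + W.T
M = np.diag(W.sum(axis=1) + 0.5) - W           # SDDM: Laplacian plus positive diagonal

F = np.array([0, 3, 5])
C = np.setdiff1d(np.arange(n), F)
S = M[np.ix_(C, C)] - M[np.ix_(C, F)] @ np.linalg.solve(M[np.ix_(F, F)], M[np.ix_(F, C)])
print(is_sddm(M), is_sddm(S))
```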

We now mention two other facts that we will use about the $\preccurlyeq$ order and Schur complements.

Fact 4.3. If $M_{FF} \preccurlyeq \widetilde{M}_{FF}$, then

$$\begin{pmatrix} M_{FF} & M_{FC} \\ M_{CF} & M_{CC} \end{pmatrix}
\preccurlyeq \begin{pmatrix} \widetilde{M}_{FF} & M_{FC} \\ M_{CF} & M_{CC} \end{pmatrix}.$$

Fact 4.4 (Lemma B.1 from [MP13]). If $M$ and $\widetilde{M}$ are positive semidefinite matrices satisfying
$M \preccurlyeq \widetilde{M}$, then

$$\mathrm{Sc}(M, F) \preccurlyeq \mathrm{Sc}(\widetilde{M}, F).$$

The first idea that motivates our algorithms is that we can sparsify $M$ and $\mathrm{Sc}(M, F)$. If
$M$ is sparse, then we can quickly multiply vectors by $M_{FC}$. However, to be able to quickly
apply the factorization of $M^{-1}$ given in Fact 4.1, we also need to be able to quickly apply $M_{FF}^{-1}$.
If we can do that, then we can quickly solve systems in $M$ by recursively solving systems in
$\mathrm{Sc}(M, F)$.

The easiest way to find an $F$ for which we could quickly apply $M_{FF}^{-1}$ would be to choose $F$
to be a large independent set, in which case $M_{FF}$ would be diagonal. Such a set $F$ must exist
as we can assume $M$ is sparse. However, the independent set we are guaranteed to find by the
sparsity of $M$ is not big enough: if we repeatedly find large independent sets and then sparsify
the resulting Schur complements, the error that accumulates could become too big. The second
idea behind our algorithms is that we can find a large set $F$ for which $M_{FF}$ is well-approximated
by a diagonal matrix. This will allow us to apply $M_{FF}^{-1}$ quickly. In the next section, we show
that a very good choice of $F$ always exists, and that the use of such sets $F$ yields nearly-optimal
algorithms for solving linear systems in $M$.

In order to make the entire algorithm efficient, we are still left with the problem of quickly 
computing a sparsifier of the Schur complement. In Section 8 we show how to quickly compute
and use Spectral Vertex Sparsifiers, which are sparsifiers of the Schur complement. In particular, 
we do this by expressing the Schur complement as the sum of the Schur complements of two 
simpler matrices: one with a diagonal FF block, and the other with a better conditioned FF 
block. We handle the matrix with the diagonal block directly, and the matrix with the better 
conditioned block recursively. 

5 A Polynomial Time Algorithm for Optimal Solver Chains 

Our algorithms will begin by eliminating a set of vertices $F$ that is $\alpha$-strongly diagonally
dominant, a concept that we now define.

Definition 5.1. A symmetric matrix $M$ is $\alpha$-strongly diagonally dominant if for all $i$

$$M_{ii} \ge (1 + \alpha) \sum_{j \ne i} |M_{ij}|.$$

We say that a subset $F$ of the rows of a matrix $M$ is $\alpha$-strongly diagonally dominant if $M_{FF}$
is an $\alpha$-strongly diagonally dominant matrix.

We remark that 0-strongly diagonal dominance coincides with the standard notion of weak
diagonal dominance. In particular, Laplacian matrices are 0-strongly diagonally dominant.

It is easy to find an $\alpha$-strongly diagonally dominant subset containing at least a $1/(8(1 + \alpha))$
fraction of the rows of an SDD matrix: one need merely pick a random subset and then discard
the rows that do not satisfy the condition.

Pseudocode for computing such a subset is given in Figure 1.

Lemma 5.2. For every $n$-dimensional SDD matrix $M$ and every $\alpha \ge 0$, SDDSubset computes
an $\alpha$-strongly diagonally dominant subset $F$ of size at least $n/(8(1 + \alpha))$ in $O(m)$ expected work
and $O(\log n)$ expected depth, where $m$ is the number of nonzero entries in $M$.

Proof. As $F$ is a subset of $F'$,

$$\sum_{j \in F, j \ne i} |M_{ij}| \le \sum_{j \in F', j \ne i} |M_{ij}|.$$

So, when the algorithm does return a set $F$, it is guaranteed to be $\alpha$-strongly diagonally
dominant.
$F = \text{SDDSubset}(M, \alpha)$, where $M$ is an $n$-dimensional SDD matrix.

1. Let $F'$ be a uniform random subset of $\{1, \ldots, n\}$ of size $\frac{n}{4(1 + \alpha)}$.

2. Set

$$F = \left\{ i \in F' \text{ such that } \sum_{j \in F', j \ne i} |M_{ij}| \le \frac{1}{1 + \alpha} |M_{ii}| \right\}.$$

3. If $|F| < \frac{n}{8(1 + \alpha)}$, go to Step 1.

4. Return $F$.


Figure 1: Routine for Generating an $\alpha$-strongly diagonally dominant subset $F$


We now show that the probability that the algorithm finishes in each iteration is at least
1/2. Let $A_i$ be the event that $i \in F'$ and that $i \notin F$. This only happens if $i \in F'$ and

$$\sum_{j \in F', j \ne i} |M_{ij}| > \frac{1}{1 + \alpha} |M_{ii}|. \qquad (4)$$

The set $F$ is exactly the set of $i \in F'$ for which $A_i$ does not hold.
Given that $i \in F'$, the probability that each other $j \ne i$ is in $F'$ is

$$\frac{1}{n - 1} \left( \frac{n}{4(1 + \alpha)} - 1 \right).$$

So,

$$\mathbb{E}\left[ \sum_{j \in F', j \ne i} |M_{ij}| \;\middle|\; i \in F' \right]
\le \frac{1}{n - 1} \left( \frac{n}{4(1 + \alpha)} - 1 \right) \sum_{j \ne i} |M_{ij}|
< \frac{1}{4(1 + \alpha)} \sum_{j \ne i} |M_{ij}|
\le \frac{1}{4(1 + \alpha)} |M_{ii}|,$$

as $M$ is diagonally dominant. So, Markov’s inequality tells us that

$$\Pr\left[ \sum_{j \in F', j \ne i} |M_{ij}| > \frac{1}{1 + \alpha} |M_{ii}| \;\middle|\; i \in F' \right] < 1/4,$$

and thus

$$\Pr[A_i] = \Pr[i \in F'] \, \Pr[i \notin F \mid i \in F'] < \frac{1}{4(1 + \alpha)} \cdot \frac{1}{4} = \frac{1}{16(1 + \alpha)}.$$

Again applying Markov’s inequality allows us to conclude

$$\Pr\left[ |\{i : A_i\}| \ge \frac{n}{8(1 + \alpha)} \right] < 1/2.$$

So, with probability at least 1/2, $|F| \ge n/(8(1 + \alpha))$, and the algorithm will pass the test
in line 3. Thus, the expected number of iterations made by the algorithm is at most 2. The
claimed bounds on the expected work and depth of the algorithm follow. □
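The SDDSubset routine of Figure 1 can be sketched in a few lines. This is a sequential, dense illustration rather than the parallel routine analyzed in Lemma 5.2; rounding the sample size up to an integer, the function name `sdd_subset`, and the path-graph test matrix are our own assumptions.

```python
import numpy as np

def sdd_subset(M, alpha, rng):
    """Sketch of SDDSubset (Figure 1) for a dense symmetric SDD matrix M."""
    n = M.shape[0]
    size = int(np.ceil(n / (4 * (1 + alpha))))        # |F'|, rounded up (our choice)
    absM = np.abs(M)
    diag = np.diag(absM).copy()
    while True:
        Fp = rng.choice(n, size=size, replace=False)              # step 1
        # For each i in F', the sum of |M_ij| over j in F' with j != i.
        off_sums = absM[np.ix_(Fp, Fp)].sum(axis=1) - diag[Fp]
        F = Fp[(1 + alpha) * off_sums <= diag[Fp]]                # step 2
        if len(F) >= n / (8 * (1 + alpha)):                       # step 3
            return np.sort(F)                                     # step 4

n, alpha = 200, 4.0
M = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
M[0, 0] = M[-1, -1] = 1.0                 # Laplacian of a path on n vertices
F = sdd_subset(M, alpha, np.random.default_rng(0))
print(len(F), n / (8 * (1 + alpha)))
```

Rows are discarded based on sums over $F'$, which can only overestimate the sums over the returned set $F$, so the returned set satisfies Definition 5.1.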

























Strongly diagonally dominant subsets are useful because linear systems involving them can
be solved rapidly. Given such a set $F$, we will construct an operator $Z_{FF}^{(k)}$ that approximates
$M_{FF}^{-1}$ and that can be applied quickly. To motivate our construction, observe that if $M_{FF} =
X_{FF} + L_{FF}$ where $X_{FF}$ is a nonnegative diagonal matrix and $L_{FF}$ is a Laplacian, then

$$M_{FF}^{-1} = X_{FF}^{-1} - X_{FF}^{-1} L_{FF} X_{FF}^{-1} + \sum_{i \ge 2} X_{FF}^{-1} \left( -L_{FF} X_{FF}^{-1} \right)^i.$$

We will approximate this series by its first few terms:

$$Z_{FF}^{(k)} \stackrel{\mathrm{def}}{=} \sum_{i=0}^{k} X_{FF}^{-1} \left( -L_{FF} X_{FF}^{-1} \right)^i. \qquad (5)$$

In the following lemmas, we show that using $Z_{FF}^{(k)}$ in place of $M_{FF}^{-1}$ in (3) provides a good
approximation of $M^{-1}$. We begin by pointing out that $X_{FF}$ is much greater than $L_{FF}$. In
particular, this implies that all diagonal entries of $X_{FF}$ are positive, so that $X_{FF}^{-1}$ actually
exists.
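The operator in (5) never needs to be formed explicitly: a Horner-style recurrence applies it to a vector with $k$ multiplications by $L_{FF}$. The sketch below is a dense illustration; the function names `split_X_L` and `apply_Z` and the test matrix are ours, and the code assumes the off-diagonal entries of $M_{FF}$ are non-positive.

```python
import numpy as np

def split_X_L(MFF):
    """Write M_FF = X + L with X diagonal and L a Laplacian.
    Returns the diagonal of X and the matrix L."""
    off = MFF - np.diag(np.diag(MFF))
    L = np.diag(-off.sum(axis=1)) + off
    return np.diag(MFF) - np.diag(L), L

def apply_Z(MFF, b, k):
    """Return Z^{(k)} b = sum_{i=0}^{k} X^{-1} (-L X^{-1})^i b using k products with L."""
    x, L = split_X_L(MFF)
    y = b / x
    z = y.copy()
    for _ in range(k):
        z = y - (L @ z) / x            # z <- X^{-1} b - X^{-1} L z
    return z

n = 6
L_path = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L_path[0, 0] = L_path[-1, -1] = 1.0
MFF = np.diag(np.arange(8.0, 8.0 + n)) + L_path     # 4-strongly diagonally dominant
b = np.ones(n)
print(np.linalg.norm(apply_Z(MFF, b, 15) - np.linalg.solve(MFF, b)))
```

Each pass through the loop is one step of the Jacobi-type iteration, so the cost of applying $Z_{FF}^{(k)}$ is $k$ sparse matrix-vector products when $L_{FF}$ is stored sparsely.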


Lemma 5.3. Let $M$ be a SDDM matrix that is $\alpha$-strongly diagonally dominant. Write $M =
X + L$ where $X$ is a nonnegative diagonal matrix and $L$ is a Laplacian. Then,

$$X \succcurlyeq \frac{\alpha}{2} L.$$

Proof. Write $L = Y - A$ where $Y$ is diagonal and $A$ has zero diagonal. As $L$ is diagonally
dominant, so is $Y + A$. This implies that $Y \succcurlyeq -A$, and so $2Y \succcurlyeq L$.

As $M$ is $\alpha$-strongly diagonally dominant and the diagonal of $M$ is $X + Y$,

$$((X + Y)\mathbf{1})_i \ge (\alpha + 1)(A\mathbf{1})_i.$$

As $L$ is a Laplacian, $L\mathbf{1} = 0$, which implies $Y\mathbf{1} = A\mathbf{1}$ and

$$(X\mathbf{1})_i \ge \alpha (A\mathbf{1})_i = \alpha (Y\mathbf{1})_i.$$

As both $X$ and $Y$ are diagonal, this implies that

$$X \succcurlyeq \alpha Y \succcurlyeq \frac{\alpha}{2} L.$$

□


We now bound the quality of approximation of the power series (5).


Lemma 5.4. Let $M$ be a SDDM matrix and let $F$ be a set of columns so that when we write
$M_{FF} = X_{FF} + L_{FF}$ with $X_{FF}$ nonnegative diagonal and $L_{FF}$ a Laplacian, we have
$L_{FF} \preccurlyeq \beta X_{FF}$. Then, for odd $k$ and for $Z_{FF}^{(k)}$ as defined in (5) we have:

$$X_{FF} + L_{FF} \preccurlyeq \left( Z_{FF}^{(k)} \right)^{-1} \preccurlyeq X_{FF} + (1 + \delta) L_{FF}, \qquad (6)$$

where

$$\delta = \beta^k \frac{1 + \beta}{1 - \beta^{k+1}}.$$
Proof. The left-hand inequality is equivalent to the statement that all the eigenvalues of
$Z_{FF}^{(k)} (X_{FF} + L_{FF})$ are at most 1 (see [BGH+06, Lemma 2.2] or [ST14, Proposition 3.3]). To see
that this is the case, expand

$$Z_{FF}^{(k)} (X_{FF} + L_{FF})
= \left( \sum_{i=0}^{k} X_{FF}^{-1} \left( -L_{FF} X_{FF}^{-1} \right)^i \right) (X_{FF} + L_{FF})
= \sum_{i=0}^{k} \left( -X_{FF}^{-1} L_{FF} \right)^i - \sum_{i=1}^{k+1} \left( -X_{FF}^{-1} L_{FF} \right)^i
= I_{FF} - \left( X_{FF}^{-1} L_{FF} \right)^{k+1}.$$

As all the eigenvalues of an even power of a matrix are nonnegative, all of the eigenvalues of
this last matrix are at most 1.

Similarly, the other inequality is equivalent to the assertion that all of the eigenvalues of
$Z_{FF}^{(k)} (X_{FF} + (1 + \delta) L_{FF})$ are at least one. Expanding this product yields

$$\left( \sum_{i=0}^{k} X_{FF}^{-1} \left( -L_{FF} X_{FF}^{-1} \right)^i \right) (X_{FF} + (1 + \delta) L_{FF})
= I_{FF} - \left( X_{FF}^{-1} L_{FF} \right)^{k+1} + \delta \sum_{i=0}^{k} (-1)^i \left( X_{FF}^{-1} L_{FF} \right)^{i+1}.$$

The eigenvalues of this matrix are precisely the numbers

$$1 - \lambda^{k+1} + \delta \sum_{i=0}^{k} (-1)^i \lambda^{i+1}, \qquad (7)$$

where $\lambda$ ranges over the eigenvalues of $X_{FF}^{-1} L_{FF}$. The assumption $L_{FF} \preccurlyeq \beta X_{FF}$ implies that
the eigenvalues of $X_{FF}^{-1} L_{FF}$ are at most $\beta$, so $0 \le \lambda \le \beta$. We have chosen the value of $\delta$ precisely
to guarantee that, under this condition on $\lambda$, the value of (7) is at least 1. □
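For readers who want to see the final step of this proof spelled out, the following short computation sketches why the stated choice of $\delta$ suffices. It uses only the quantities already defined in the proof, together with the assumptions that $k$ is odd and $\beta < 1$; the case $\lambda = 0$ is immediate because (7) then equals 1.

```latex
% For odd k the alternating sum in (7) is a finite geometric series:
\sum_{i=0}^{k} (-1)^i \lambda^{i+1}
  = \lambda \, \frac{1 - (-\lambda)^{k+1}}{1 + \lambda}
  = \frac{\lambda \left( 1 - \lambda^{k+1} \right)}{1 + \lambda}.

% Hence, for 0 < \lambda \le \beta < 1, expression (7) is at least 1 exactly when
\delta \, \frac{\lambda \left( 1 - \lambda^{k+1} \right)}{1 + \lambda} \ge \lambda^{k+1}
\quad \Longleftrightarrow \quad
\delta \ge \frac{\lambda^{k} (1 + \lambda)}{1 - \lambda^{k+1}}.

% The right-hand side is increasing in \lambda on [0, 1), so its maximum over
% 0 < \lambda \le \beta is attained at \lambda = \beta:
\frac{\lambda^{k} (1 + \lambda)}{1 - \lambda^{k+1}}
  \le \frac{\beta^{k} (1 + \beta)}{1 - \beta^{k+1}} = \delta.
```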

We remark that this power series is identical to the Jacobi iteration for solving linear systems. 

The following lemma allows us to extend the approximation of $M_{FF}$ by the inverse of $Z_{FF}^{(k)}$
to the entire matrix $M$.

Lemma 5.5. Under the conditions of Lemma 5.4 and assuming that $0 \le \beta \le 1/2$,

$$M \preccurlyeq \begin{pmatrix} \left( Z_{FF}^{(k)} \right)^{-1} & M_{FC} \\ M_{CF} & M_{CC} \end{pmatrix}
\preccurlyeq (1 + 2\beta^k) M.$$


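Proof. The left-hand inequality follows immediately from Fact 4.3 and the left-hand side of (6).
To prove the right-hand inequality we apply Fact 4.3 and the right-hand side of (6) to conclude

$$\begin{pmatrix} \left( Z_{FF}^{(k)} \right)^{-1} & M_{FC} \\ M_{CF} & M_{CC} \end{pmatrix}
\preccurlyeq \begin{pmatrix} M_{FF} + \delta L_{FF} & M_{FC} \\ M_{CF} & M_{CC} \end{pmatrix}
= M + \delta \begin{pmatrix} L_{FF} & 0 \\ 0 & 0 \end{pmatrix}.$$

Consider the (unique) decomposition of $M$ into $L + X$ where $L$ is a graph Laplacian. When
viewed as graphs, $L_{FF}$ is a subgraph of $L$, which means:

$$\begin{pmatrix} L_{FF} & 0 \\ 0 & 0 \end{pmatrix} \preccurlyeq L \preccurlyeq M,$$

by which we may conclude that

$$M + \delta \begin{pmatrix} L_{FF} & 0 \\ 0 & 0 \end{pmatrix} \preccurlyeq (1 + \delta) M.$$

To finish the proof, recall that $\delta = \beta^k (1 + \beta) / (1 - \beta^{k+1})$ and observe that for $k \ge 1$ and $\beta \le 1/2$,
$\delta \le 2\beta^k$. □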

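We now show that we can obtain a good approximation of $M^{-1}$ by replacing $M_{FF}^{-1}$ by $Z_{FF}^{(k)}$
in the three places in which it explicitly appears in (3), but not in the Schur complement.


Lemma 5.6. Let $M$ be a SDDM matrix and let $F$ be an $\alpha$-strongly diagonally dominant set of
columns for some $\alpha \ge 4$. Then, for $k$ odd and $Z_{FF}^{(k)}$ as defined in (5),

$$\begin{bmatrix} I & -Z_{FF}^{(k)} M_{FC} \\ 0 & I \end{bmatrix}
\begin{bmatrix} Z_{FF}^{(k)} & 0 \\ 0 & \mathrm{Sc}(M, F)^{-1} \end{bmatrix}
\begin{bmatrix} I & 0 \\ -M_{CF} Z_{FF}^{(k)} & I \end{bmatrix}
\approx_{\gamma} M^{-1},$$

for $\gamma = 2(2/\alpha)^k$.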


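Proof. Define

$$\widetilde{M} = \begin{pmatrix} \left( Z_{FF}^{(k)} \right)^{-1} & M_{FC} \\ M_{CF} & M_{CC} \end{pmatrix}.$$

Lemma 5.3 tells us that $M$ satisfies the conditions of Lemma 5.4 with $\beta = 2/\alpha$. So,
Lemma 5.5 implies

$$M \preccurlyeq \widetilde{M} \preccurlyeq (1 + \gamma) M.$$

By Facts 4.1 and 3.1 this implies

$$M^{-1} \succcurlyeq
\begin{bmatrix} I & -Z_{FF}^{(k)} M_{FC} \\ 0 & I \end{bmatrix}
\begin{bmatrix} Z_{FF}^{(k)} & 0 \\ 0 & \mathrm{Sc}(\widetilde{M}, F)^{-1} \end{bmatrix}
\begin{bmatrix} I & 0 \\ -M_{CF} Z_{FF}^{(k)} & I \end{bmatrix}
\succcurlyeq (1 + \gamma)^{-1} M^{-1}.$$

From Facts 4.4 and 3.1, we know that

$$\mathrm{Sc}(M, F)^{-1} \succcurlyeq \mathrm{Sc}(\widetilde{M}, F)^{-1} \succcurlyeq (1 + \gamma)^{-1} \mathrm{Sc}(M, F)^{-1}.$$

When we use Fact 3.2 to substitute this inequality into the one above, we obtain

$$(1 + \gamma) M^{-1} \succcurlyeq
\begin{bmatrix} I & -Z_{FF}^{(k)} M_{FC} \\ 0 & I \end{bmatrix}
\begin{bmatrix} Z_{FF}^{(k)} & 0 \\ 0 & \mathrm{Sc}(M, F)^{-1} \end{bmatrix}
\begin{bmatrix} I & 0 \\ -M_{CF} Z_{FF}^{(k)} & I \end{bmatrix}
\succcurlyeq (1 + \gamma)^{-1} M^{-1},$$

which implies the lemma. □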
We now use Lemma 5.6 to analyze a solver obtained by iteratively sparsifying Schur complements
of strongly diagonally dominant subsets. We refer to the sequence of subsets and matrices
obtained as a vertex sparsifier chain, as an approximation of a Schur complement is a spectral
vertex sparsifier. In the following definition, $M^{(1)}$ is intended to be a sparse approximation of
$M^{(0)}$. The sparsity of the matrices will show up in the analysis of the runtime, but not in the
definition of the chain.

Definition 5.7 (Vertex Sparsifier chain). For any SDDM matrix $M^{(0)}$, a vertex sparsifier chain
of $M^{(0)}$ with parameters $\alpha_i \ge 4$ and $1/2 \ge \epsilon_i > 0$ is a sequence of matrices and subsets
$(M^{(1)}, \ldots, M^{(d)}; F_1, \ldots, F_{d-1})$ such that:

1. $M^{(1)} \approx_{\epsilon_0} M^{(0)}$,

2. $M^{(i+1)} \approx_{\epsilon_i} \mathrm{Sc}(M^{(i)}, F_i)$,

3. $M^{(i)}_{F_i F_i}$ is $\alpha_i$-strongly diagonally dominant, and

4. $M^{(d)}$ has size $O(1)$.

We present pseudocode that uses a vertex sparsifier chain to approximately solve a system
of equations in $M^{(0)}$ in Figure 2. We analyze the running time and accuracy of this algorithm
in Lemma 5.8.


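$x^{(1)} = \text{ApplyChain}(M^{(1)}, \ldots, M^{(d)}, F_1, \ldots, F_{d-1}, \alpha_1 \ldots \alpha_{d-1}, \epsilon_0 \ldots \epsilon_{d-1}, b^{(1)})$

1. For $i = 1, \ldots, d - 1$

(a) let $k_i$ be the smallest odd integer greater than or equal to $\log_{\alpha_i/2}(2/\epsilon_i)$.

(b) $x^{(i)}_{F_i} \leftarrow Z^{(k_i)}_{F_i F_i} b^{(i)}_{F_i}$, where $Z^{(k_i)}_{F_i F_i}$ is obtained from $M^{(i)}_{F_i F_i}$, as in (5).

(c) $b^{(i+1)} \leftarrow b^{(i)}_{C_i} - M^{(i)}_{C_i F_i} x^{(i)}_{F_i}$.

2. $x^{(d)} \leftarrow \left( M^{(d)} \right)^{-1} b^{(d)}$.

3. For $i = d - 1, \ldots, 1$

(a) $x^{(i)}_{C_i} \leftarrow x^{(i+1)}$.

(b) $x^{(i)}_{F_i} \leftarrow x^{(i)}_{F_i} - Z^{(k_i)}_{F_i F_i} M^{(i)}_{F_i C_i} x^{(i+1)}$.


Figure 2: Solver Algorithm using Vertex Sparsifier Chain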
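The steps of Figure 2 can be sketched in dense NumPy as follows. This is an illustration under several simplifying assumptions of ours, not the construction analyzed in the paper: the chain uses exact Schur complements instead of sparsified ones, a hypothetical greedy routine `greedy_sdd_subset` stands in for SDDSubset, and a single $\alpha$ and $\epsilon$ are used at every level. All function names are ours.

```python
import numpy as np

def greedy_sdd_subset(M, alpha):
    """Hypothetical stand-in for SDDSubset: greedily grow F (at most half the rows)
    so that M[F, F] stays alpha-strongly diagonally dominant."""
    n = M.shape[0]
    diag = np.diag(M)
    absoff = np.abs(M - np.diag(diag))
    F, s = [], np.zeros(n)                        # s[i] = sum over j in F of |M_ij|
    for v in range(n):
        if len(F) >= n // 2:
            break
        if (1 + alpha) * s[v] <= diag[v] and all(
                (1 + alpha) * (s[u] + absoff[u, v]) <= diag[u] for u in F):
            F.append(v)
            s += absoff[:, v]
    return np.array(F)

def build_chain(M, alpha=4.0, min_size=3):
    """Chain with exact Schur complements, so M^(i+1) = Sc(M^(i), F_i) (no sparsification)."""
    mats, subsets = [M], []
    while mats[-1].shape[0] > min_size:
        Mi = mats[-1]
        F = greedy_sdd_subset(Mi, alpha)
        C = np.setdiff1d(np.arange(Mi.shape[0]), F)
        S = Mi[np.ix_(C, C)] - Mi[np.ix_(C, F)] @ np.linalg.solve(
            Mi[np.ix_(F, F)], Mi[np.ix_(F, C)])
        mats.append(S)
        subsets.append((F, C))
    return mats, subsets

def apply_Z(MFF, b, k):
    """Z^(k) b from (5), using k matrix-vector products with the Laplacian part."""
    off = MFF - np.diag(np.diag(MFF))
    L = np.diag(-off.sum(axis=1)) + off           # Laplacian part of M_FF
    x = np.diag(MFF) - np.diag(L)                 # diagonal of X_FF
    y = b / x
    z = y.copy()
    for _ in range(k):
        z = y - (L @ z) / x
    return z

def apply_chain(mats, subsets, b, alpha=4.0, eps=0.5):
    """Steps 1-3 of ApplyChain with the same alpha and eps at every level."""
    k = int(np.ceil(np.log(2 / eps) / np.log(alpha / 2)))
    k += (k + 1) % 2                              # round up to an odd integer
    bs, xFs = [b], []
    for Mi, (F, C) in zip(mats, subsets):         # step 1: forward elimination
        xF = apply_Z(Mi[np.ix_(F, F)], bs[-1][F], k)
        xFs.append(xF)
        bs.append(bs[-1][C] - Mi[np.ix_(C, F)] @ xF)
    x = np.linalg.solve(mats[-1], bs[-1])         # step 2: solve the constant-size system
    for Mi, (F, C), xF in reversed(list(zip(mats, subsets, xFs))):   # step 3
        full = np.empty(Mi.shape[0])
        full[C] = x
        full[F] = xF - apply_Z(Mi[np.ix_(F, F)], Mi[np.ix_(F, C)] @ x, k)
        x = full
    return x

rng = np.random.default_rng(0)
n = 30
Wt = np.triu(10.0 ** rng.uniform(-3, 0, (n, n)), 1)
Wt = Wt + Wt.T
M = np.diag(Wt.sum(axis=1) + 0.1) - Wt            # dense SDDM test matrix
mats, subsets = build_chain(M)
b = rng.standard_normal(n)
x = apply_chain(mats, subsets, b, eps=1e-3)
print(np.linalg.norm(M @ x - b) / np.linalg.norm(b))    # relative residual
```

Because the Schur complements here are exact, the only error comes from replacing $M_{FF}^{-1}$ by $Z_{FF}^{(k)}$, so the operator should approximate $M^{-1}$ within the per-level bound of Lemma 5.6 summed over the levels.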


Lemma 5.8. Given a vertex sparsifier chain where $M^{(i)}$ has $m_i$ non-zero entries, the algorithm
ApplyChain$(M^{(1)}, \ldots, M^{(d)}, F_1, \ldots, F_{d-1}, \alpha_1 \ldots \alpha_{d-1}, \epsilon_0 \ldots \epsilon_{d-1}, b)$ corresponds to a
linear operator $W$ acting on $b$ such that

1. $W^{-1} \approx_{\sum_{i=0}^{d-1} 2\epsilon_i} M^{(0)}$,

and

2. for any vector $b$, ApplyChain$(M^{(1)}, \ldots, M^{(d)}, F_1, \ldots, F_{d-1}, \alpha_1 \ldots \alpha_{d-1}, \epsilon_0 \ldots \epsilon_{d-1}, b)$ runs
in $O\left( \sum_{i=1}^{d-1} \left( \log_{\alpha_i} (\epsilon_i^{-1}) \right) \log n \right)$ depth and $O\left( \sum_{i=1}^{d-1} \left( \log_{\alpha_i} (\epsilon_i^{-1}) \right) m_i \right)$ work.

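Proof. We begin by observing that the output vector $x^{(1)}$ is a linear transformation of the input
vector $b^{(1)}$. Let $W^{(1)}$ be the matrix that realizes this transformation. Similarly, for $2 \le i \le d$,
define $W^{(i)}$ to be the matrix so that

$$x^{(i)} = W^{(i)} b^{(i)}.$$

An examination of the algorithm reveals that

$$W^{(d)} = \left( M^{(d)} \right)^{-1}, \qquad (8)$$

and

$$W^{(i)} = \begin{bmatrix} I & -Z^{(k_i)}_{F_i F_i} M^{(i)}_{F_i C_i} \\ 0 & I \end{bmatrix}
\begin{bmatrix} Z^{(k_i)}_{F_i F_i} & 0 \\ 0 & W^{(i+1)} \end{bmatrix}
\begin{bmatrix} I & 0 \\ -M^{(i)}_{C_i F_i} Z^{(k_i)}_{F_i F_i} & I \end{bmatrix}. \qquad (9)$$

We will now prove by backwards induction on $i$ that

$$\left( W^{(i)} \right)^{-1} \approx_{\sum_{j=i}^{d-1} 2\epsilon_j} M^{(i)}.$$

The base case of $i = d$ follows from (8). When we substitute our choice of $k_i$ from line 1a of
ApplyChain into Lemma 5.6, we find that

$$\begin{bmatrix} I & -Z^{(k_i)}_{F_i F_i} M^{(i)}_{F_i C_i} \\ 0 & I \end{bmatrix}
\begin{bmatrix} Z^{(k_i)}_{F_i F_i} & 0 \\ 0 & \mathrm{Sc}(M^{(i)}, F_i)^{-1} \end{bmatrix}
\begin{bmatrix} I & 0 \\ -M^{(i)}_{C_i F_i} Z^{(k_i)}_{F_i F_i} & I \end{bmatrix}
\approx_{\epsilon_i} \left( M^{(i)} \right)^{-1}.$$

As $M^{(i+1)} \approx_{\epsilon_i} \mathrm{Sc}(M^{(i)}, F_i)$,

$$\begin{bmatrix} I & -Z^{(k_i)}_{F_i F_i} M^{(i)}_{F_i C_i} \\ 0 & I \end{bmatrix}
\begin{bmatrix} Z^{(k_i)}_{F_i F_i} & 0 \\ 0 & \left( M^{(i+1)} \right)^{-1} \end{bmatrix}
\begin{bmatrix} I & 0 \\ -M^{(i)}_{C_i F_i} Z^{(k_i)}_{F_i F_i} & I \end{bmatrix}
\approx_{2\epsilon_i} \left( M^{(i)} \right)^{-1}.$$

By combining this identity with (9) and our inductive hypothesis, we obtain

$$W^{(i)} \approx_{\sum_{j=i}^{d-1} 2\epsilon_j} \left( M^{(i)} \right)^{-1}.$$

Finally, as $M^{(0)} \approx_{\epsilon_0} M^{(1)}$,

$$W^{(1)} \approx_{\sum_{j=0}^{d-1} 2\epsilon_j} \left( M^{(0)} \right)^{-1}.$$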
To bound the work and depth of the algorithm, we observe that we do not need to construct
the matrices $Z^{(k_i)}_{F_i F_i}$ explicitly. Rather, we multiply vectors by the matrices by performing $k_i =
O(\log_{\alpha_i} (\epsilon_i^{-1}))$ matrix-vector products by the submatrices of $M^{(i)}_{F_i F_i}$ that appear in the expression
(5). As each matrix-vector product can be performed in depth $O(\log n)$, the depth of the whole
algorithm is bounded by $O((\log n) \sum_{i=1}^{d-1} k_i)$. As each matrix $M^{(i)}$ has $m_i$ non-zero entries, and
the work of the $i$th iteration is dominated by the cost of multiplying by submatrices of $M^{(i)}$
$O(k_i)$ times, the total work of the algorithm is $O(\sum_{i=1}^{d-1} m_i k_i)$. □


Definition 5.9 (Work and Depth of a Vertex Sparsifier chain). An $\epsilon$-vertex sparsifier chain
of an SDDM matrix $M^{(0)}$ of depth $D$ and work $W$ is a vertex sparsifier chain of $M^{(0)}$ with
parameters $\alpha_i \ge 4$ and $1/2 \ge \epsilon_i > 0$ that satisfies

1. $2 \sum_{i=0}^{d-1} \epsilon_i \le \epsilon$,

2. $\sum_{i=1}^{d-1} m_i \log_{\alpha_i} \epsilon_i^{-1} \le W$, where $m_i$ is the number of nonzeros in $M^{(i)}$, and

3. $\sum_{i=1}^{d-1} (\log n) \log_{\alpha_i} \epsilon_i^{-1} \le D$, where $n$ is the dimension of $M^{(0)}$.

Theorem 5.10. Every SDDM matrix $M$ of dimension $n$ has a 1-vertex sparsifier chain of depth
$O(\log^2 n \log\log n)$ and work $O(n)$. Given such a vertex sparsifier chain, for any vector $b$, we can
compute an $\epsilon$-approximate solution to $M^{-1} b$ in $O(m \log(1/\epsilon))$ work and $O(\log^2 n \log\log n \log(1/\epsilon))$
depth.


Proof. We will show the existence of such a vertex sparsifier chain with $\alpha_i = 4$ for all $i$ and
$\epsilon_i = \frac{1}{2(i+2)^2}$. Lemma 5.2 tells us that every SDDM matrix has a 4-strongly diagonally dominant
subset consisting of at least a $\frac{1}{8(1+4)} = \frac{1}{40}$ fraction of its columns. By taking such a subset,
we ensure that the number of vertices of $M^{(i)}$, which we define to be $n_i$, satisfies

\[ n_i \le \left(\frac{39}{40}\right)^{i-1} n. \]


In particular, this means that d, the number of matrices in the chain, will be logarithmic in n. 

If we use Theorem 3.5 to find a matrix $M^{(1)}$ that is an $\epsilon_0$-approximation of $M^{(0)} = M$, and
to find a matrix $M^{(i+1)}$ that is an $\epsilon_i$-approximation of $\mathrm{Sc}(M^{(i)}, F_i)$, then each matrix
$M^{(i)}$ will have a number of nonzero entries satisfying

\[ m_i \le O(n_i/\epsilon_{i-1}^2) \le O((i+1)^4 n_i). \]

Lemma 5.8 tells us that the vertex sparsifier chain induces a linear operator that is an
$\epsilon$-approximation of the inverse of $M$, where

\[ \epsilon \le 2\sum_{i=0}^{d-1} \epsilon_i = \sum_{i=0}^{d-1} \frac{1}{(i+2)^2} \le \sum_{i \ge 2} \frac{1}{i^2} < 1. \]


To compute the work and depth of the chain, recall that we set $k_i$ to be the smallest odd
integer that is at least $\log_{\alpha_i/2} \epsilon_i^{-1}$, so $k_i \le O(\log(i+1))$. Thus, the work of the chain is at most

\[ \sum_{i=1}^{d} k_i m_i \le O\left(\sum_{i=1}^{d} \log(i+1) \left(\frac{39}{40}\right)^{i-1} (i+1)^4 n\right) \le O\left(\sum_{i=1}^{d} \left(\frac{39}{40}\right)^{i-1} i^5 n\right) \le O(n). \]

Similarly, the depth of the chain is at most

\[ \sum_{i=1}^{d} (\log n) k_i \le O\left(\sum_{i=1}^{d} (\log n) \log d\right) \le O(\log^2 n \log\log n). \]

□
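As a numeric sanity check on the constants in this proof, the following sketch evaluates the error budget $2\sum_i \epsilon_i$ and the per-vertex work bound $\sum_i k_i (39/40)^{i-1} (i+1)^4$ for $\alpha_i = 4$ and $\epsilon_i = 1/(2(i+2)^2)$. The function names and the truncation at a few thousand levels are illustrative assumptions, and the constants hidden in the $O(\cdot)$ bounds are taken to be 1.

```python
import math

def epsilon(i):
    """Error parameter eps_i = 1 / (2 (i + 2)^2) used in the proof."""
    return 1.0 / (2 * (i + 2) ** 2)

def k_steps(i, alpha=4.0):
    """Smallest odd integer at least log_{alpha/2}(1 / eps_i), up to float rounding."""
    k = max(1, math.ceil(math.log(1.0 / epsilon(i), alpha / 2.0)))
    return k if k % 2 == 1 else k + 1

def error_budget(levels):
    """2 * sum_{i=0}^{levels-1} eps_i, the approximation quality of the chain."""
    return 2 * sum(epsilon(i) for i in range(levels))

def work_per_vertex(levels):
    """sum_{i=1}^{levels} k_i (39/40)^(i-1) (i+1)^4, the work bound divided by n."""
    return sum(k_steps(i) * (39 / 40) ** (i - 1) * (i + 1) ** 4
               for i in range(1, levels + 1))

budget = error_budget(2000)
work_a = work_per_vertex(2000)
work_b = work_per_vertex(4000)   # doubling the chain length barely changes the sum
print(f"error budget = {budget:.4f}")   # stays below pi^2/6 - 1 < 1
print(f"work / n     = {work_a:.3e}")   # a large but fixed constant
```

The geometric factor $(39/40)^{i-1}$ eventually dominates the polynomial factor, which is why the truncated sums agree; the constant itself is large, as one would expect from an existence argument.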


6 Linear sized $U^T D U$ approximations


We now show that the vertex sparsifier chains of $M$ from the previous section can be used to
construct Cholesky factorizations of matrices that are 2-approximations of $M$. In particular, we
prove that for every SDDM matrix $M$ of dimension $n$ there exists a diagonal matrix $D$ and an
upper-triangular matrix $U$ having $O(n)$ nonzero entries such that $U^T D U$ is a 2-approximation
of $M$.

The obstacle to obtaining such a factorization is that it does not allow us to multiply a
vector by $Z^{(k)}_{FF}$ in many steps. Rather, we must explicitly construct the matrices $Z^{(k)}_{FF}$. If we
directly apply the construction suggested in the previous section, these matrices could be dense
and thereby result in a matrix $U$ with too many nonzero entries. To get around this problem,
we show that we can always find strongly diagonally dominant subsets in which all the vertices
have low degree. This will ensure that all of the matrices $Z^{(k)}_{FF}$ are sparse.

Lemma 6.1. For every $n$-dimensional SDD matrix $M$ and every $\alpha > 0$, there is an $\alpha$-strongly
diagonally dominant subset of columns $F$ of size at least $\frac{n}{16(1+\alpha)}$ such that the number of nonzeros
in every column in $F$ is at most twice the average number of nonzeros in columns of $M$.

Proof. Discard every column of $M$ that has more than twice the average number of nonzeros
per column. Then remove the corresponding rows. At most half of the columns can exceed twice the
average, so the remaining matrix has dimension at least $n/2$. Use Lemma 5.2 to find an $\alpha$-strongly
diagonally dominant subset of the columns of this matrix. □
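The discarding step in this proof is a Markov-type counting argument, and can be sketched as follows. The function name and the sample nonzero counts are hypothetical; the sketch only performs the filtering step and does not implement Lemma 5.2.

```python
def low_degree_columns(nnz_per_column):
    """Indices of columns whose nonzero count is at most twice the average."""
    n = len(nnz_per_column)
    average = sum(nnz_per_column) / n
    return [j for j, count in enumerate(nnz_per_column) if count <= 2 * average]

counts = [3, 2, 50, 4, 3, 2, 60, 3]   # hypothetical nonzeros per column
kept = low_degree_columns(counts)
print(kept)                            # [0, 1, 3, 4, 5, 7]
```

Fewer than half of the columns can have more than twice the average count, so at least $n/2$ columns always survive the filter.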

To obtain a $U^T D U$ factorization from a vertex sparsifier chain, we employ the procedure
in Figure 3.

Lemma 6.2. On input a vertex sparsifier chain of $M$ with parameters $\alpha_i \ge 4$ and $\epsilon_i > 0$, the
algorithm Decompose produces matrices $D$ and $U$ such that

\[ U^T D U \approx_\gamma M, \]

where

\[ \gamma \le 2 \sum_{i=0}^{d-1} \epsilon_i + 4/\min_i \alpha_i. \]


$(D, U) = \mathrm{Decompose}(M^{(1)}, \ldots, M^{(d)}, F_1, \ldots, F_{d-1})$, where each $M^{(i)}$ is an SDDM matrix.

1. Let $k_i$ be the smallest odd integer greater than or equal to $\log_{\alpha_i/2} \epsilon_i^{-1}$.

2. For each $i < d$, write $M^{(i)}_{F_iF_i} = X^{(i)} + L^{(i)}$, where $X^{(i)}$ is a positive diagonal matrix and $L^{(i)}$
is a Laplacian.

3. Let $X^{(d)} = I_{C_{d-1}}$ and let $\hat{U}$ be the upper-triangular Cholesky factor of $M^{(d)}$.

4. Let $D$ be the diagonal matrix with $D_{F_iF_i} = X^{(i)}$, for $1 \le i < d$, and $D_{C_{d-1}C_{d-1}} = I_{C_{d-1}}$.

5. Let $U$ be the upper-triangular matrix with 1s on the diagonal, $U_{C_{d-1}C_{d-1}} = \hat{U}$, and
$U_{F_iC_i} = Z^{(k_i)}_{F_iF_i} M_{F_iC_i}$, for $1 \le i < d$.

Figure 3: Converting a vertex sparsifier chain into U and D.

Proof. Consider the inverse of the operator $W = W^{(1)}$ realized by the algorithm ApplyChain,
and the operators $W^{(i)}$ that appear in the proof of Lemma 5.8. We have

\[
(W^{(i)})^{-1} =
\begin{bmatrix} I & 0 \\ M_{C_iF_i} Z^{(k_i)}_{F_iF_i} & I \end{bmatrix}
\begin{bmatrix} (Z^{(k_i)}_{F_iF_i})^{-1} & 0 \\ 0 & (W^{(i+1)})^{-1} \end{bmatrix}
\begin{bmatrix} I & Z^{(k_i)}_{F_iF_i} M_{F_iC_i} \\ 0 & I \end{bmatrix}
\]

and

\[ (W^{(d)})^{-1} = M^{(d)} = \hat{U}^T \hat{U}. \]
After expanding and multiplying the matrices in this recursive factorization, we obtain

\[
(W^{(1)})^{-1} = U^T
\begin{bmatrix}
(Z^{(k_1)}_{F_1F_1})^{-1} & 0 & 0 & 0 \\
0 & \ddots & 0 & 0 \\
0 & 0 & (Z^{(k_{d-1})}_{F_{d-1}F_{d-1}})^{-1} & 0 \\
0 & 0 & 0 & I_{C_{d-1}C_{d-1}}
\end{bmatrix}
U.
\]

Moreover, we know that this latter matrix is a $2\sum_{i=0}^{d-1}\epsilon_i$-approximation of $M$. It remains to
determine the impact of replacing the matrix in the middle of this expression with $D$.

It suffices to examine how well each matrix $(Z^{(k_i)}_{F_iF_i})^{-1}$ is approximated by $X^{(i)}$. From
Lemma 5.3 we know that

\[ X^{(i)} \succeq (\alpha_i/2) L^{(i)}. \]

Thus, we may use Lemma 5.4 with $\beta = \alpha_i/2$ to conclude that

\[ X^{(i)} \approx_{4/\alpha_i} (Z^{(k_i)}_{F_iF_i})^{-1}. \]

This implies that replacing each of the matrices $(Z^{(k_i)}_{F_iF_i})^{-1}$ by $X^{(i)}$ increases the approximation
factor by at most $4/\min_i \alpha_i$. □

Using this decomposition procedure in a way similar to Theorem 5.10, but with subsets
chosen using Lemma 6.1, gives the linear sized decomposition.

Theorem 6.3. For every $n$-dimensional SDDM matrix $M$ there exists a diagonal matrix $D$
and an upper triangular matrix $U$ with $O(n)$ nonzero entries so that

\[ U^T D U \approx_2 M. \]

Moreover, back and forward solves in $U$ can be performed with linear work in depth $O(\log^2 n)$.

Proof. We choose the same parameters as were used in the proof of Theorem 5.10: $\alpha_i = 4$ for
all $i$ and $\epsilon_i = \frac{1}{2(i+2)^2}$. Theorem 3.5 then guarantees that the average number of nonzero
entries in each column of $M^{(i)}$ is at most $10/\epsilon_{i-1}^2 = 40(i+1)^4$. If we now apply Lemma 6.1 to find
4-strongly diagonally dominant subsets $F_i$ of each $M^{(i)}$, we find that each such subset contains at least a
1/80 fraction of the columns of its matrix and that each column and row of $M^{(i)}$ indexed by $F_i$
has at most $80(i+1)^4$ nonzero entries. This implies that each row of $Z^{(k_i)}_{F_iF_i} M_{F_iC_i}$ has at most
$(80(i+1)^4)^{k_i+1}$ nonzero entries.

Let $n_i$ denote the dimension of $M^{(i)}$. By induction, we know that

\[ n_i \le n \left(1 - \frac{1}{80}\right)^{i-1}. \]

So, the total number of nonzero entries in $U$ is at most

\[ \sum_{i=1}^{d} n_i (80(i+1)^4)^{k_i+1} \le n \sum_{i=1}^{d} \left(1 - \frac{1}{80}\right)^{i-1} (80(i+1)^4)^{k_i+1}. \]

We will show that the term multiplying $n$ in this latter expression is upper bounded by a constant.
To see this, note that $k_i \le 1 + \log_{\alpha_i/2}(2\epsilon_i^{-1}) \le \nu \log(i+1)$ for some constant $\nu$. So, there is
some other constant $\mu$ for which

\[ (80(i+1)^4)^{k_i+1} \le \exp(\mu \log^2(i+1)). \]

This implies that the sum is at most

\[ \sum_{i \ge 1} \exp(\mu \log^2(i+1) - i/80), \]

which is bounded by a constant.

The claimed bound on the work to perform backwards and forwards substitution with $U$
is standard: these operations require work linear in the number of nonzero entries of $U$. The
bound on the depth follows from the fact that the substitutions can be performed blockwise, take
depth $O(\log n)$ for each block, and the number of blocks, $d$, is logarithmic in $n$. □

7 Existence of Linear Work and $O(\log n \log^2 \log n)$ depth Solvers

The factorizations constructed in the previous section can be evaluated in $O(\log^2 n)$ depth and
$O(n)$ work. One $O(\log n)$ factor comes from the depth of the recursion and another $O(\log n)$
factor comes from the depth of matrix-vector multiplication. The reason that matrix-vector
multiplication can take logarithmic depth is that computing the sum of $k$ numbers takes $O(\log k)$
depth. Thus, if we can instead multiply by matrices with $k^{O(1)}$ nonzeros in each row and column,
for some small $k$, we can reduce the depth of each matrix-vector multiplication to $O(\log k)$.
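The $O(\log k)$ depth for summing $k$ numbers comes from pairwise (tree) reduction. The sketch below simulates that reduction sequentially and counts the rounds; the function name is an assumption made for illustration, and a real parallel implementation would execute each round's additions concurrently.

```python
def tree_sum(values):
    """Sum numbers by pairwise reduction; return (total, number of parallel rounds)."""
    level = list(values)
    rounds = 0
    while len(level) > 1:
        # All additions within one round are independent of each other.
        paired = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            paired.append(level[-1])   # an unpaired element is carried to the next round
        level = paired
        rounds += 1
    return (level[0] if level else 0), rounds

total, depth = tree_sum(range(1, 17))   # k = 16 numbers
print(total, depth)                     # 136 4
```

Each round halves the number of partial sums, so a row with $k$ nonzeros contributes about $\log_2 k$ rounds to the depth of a matrix-vector product.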

Although the number of non-zeros in each row of $M_{F_iC_i}$ is bounded by $(80(i+1)^4)^{k_i+1}$,
the number of non-zeros per column can be high. This is because although we picked $F_i$ to be
of bounded degree, many of those vertices can be adjacent to a few vertices in $C_i$. For the
factorization constructed in Section 6, $k$ can be as large as $n$. In this section, we reduce this
degree to $\log^{O(1)} n$ by splitting high degree vertices. This leads to a factorization that can be
evaluated in linear work and $O(\log n \log^2 \log n)$ depth.


7.1 Splitting High Degree Vertices 

While sparsification produces graphs with few edges, it does not guarantee that every vertex has 
low degree. We will approximate an arbitrary graph by one of bounded degree by splitting each 
high degree vertex into many vertices. The edges that were originally attached to that vertex 
will be partitioned among the vertices into which it is split. The vertices into which it is split 
will then be connected by a complete graph, or an expander if the complete graph would have 
too high degree. The resulting bounded-degree graph has more vertices. To approximate the 
original graph, we take a Schur complement of the bounded-degree graph with respect to the 
extra vertices. We recall that one can solve a system of equations in a Schur complement of a
matrix by solving one equation in the original matrix (see footnote 1).
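The fact recalled above can be checked on a small example: solving in the Schur complement gives the same answer as solving in the full matrix with a zero-padded right-hand side. The sketch below uses exact rational arithmetic; the helper names and the 4 x 4 test matrix are hypothetical choices, and the dense Gauss-Jordan solver is only meant for tiny inputs.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve matrix * x = rhs exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    rows = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[n] for row in rows]

def schur_complement(matrix, keep):
    """Schur complement onto the first `keep` indices, eliminating the last ones."""
    n = len(matrix)
    elim = list(range(keep, n))
    m_ss = [[matrix[s][t] for t in elim] for s in elim]
    sc = [[Fraction(matrix[i][j]) for j in range(keep)] for i in range(keep)]
    for j in range(keep):
        y = solve(m_ss, [matrix[s][j] for s in elim])   # y = M_SS^{-1} M_Sj
        for i in range(keep):
            sc[i][j] -= sum(Fraction(matrix[i][s]) * y[t] for t, s in enumerate(elim))
    return sc

M = [[3, -1, -1, 0],
     [-1, 3, 0, -1],
     [-1, 0, 3, -1],
     [0, -1, -1, 3]]
b = [1, 2]

sc = schur_complement(M, keep=2)
x_direct = solve(sc, b)                 # solve Sc(M, S) x = b
x_padded = solve(M, b + [0, 0])[:2]     # solve M x_hat = b_hat, then drop S
print(x_direct == x_padded)             # True
```

The padded solve never forms the Schur complement, which is what makes the splitting construction usable inside a solver.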

We begin our analysis by examining what happens when we split an individual vertex. 


Lemma 7.1. Let $G$ be a weighted star graph with vertex set $\{v_1, \ldots, v_n, u\}$ and edges connecting
$u$ to each $v_i$ with weight $w_i$. Let $\hat{G}$ be a graph with vertex set $\{v_1, \ldots, v_n, u_1, \ldots, u_k\}$ in which
the vertices $\{u_1, \ldots, u_k\}$ are connected by a complete graph of edges of weight $W = \delta^{-1} \sum_i w_i$,
and each vertex $v_i$ is connected to exactly one vertex $u_j$, again by an edge of weight $w_i$. Let
$U = \{u_2, \ldots, u_k\}$. Then, $\mathrm{Sc}(\hat{G}, U) \preceq G$, and in $\mathrm{Sc}(\hat{G}, U)$ the edge between $u_1$ and $v_i$ has
weight at least $w_i(1 - 2\delta)$, for every $i$.

Proof. We will examine the Laplacian matrices of $G$ and $\hat{G}$. Define $w_{tot} = \sum_i w_i$, so $W = w_{tot}/\delta$.
Let $b$ be the vector of weights $w_1, \ldots, w_n$, and let $B$ be the diagonal matrix of $b$, so that

\[ L_G = \begin{bmatrix} B & -b \\ -b^T & w_{tot} \end{bmatrix}. \]

Similarly, let $C$ be the adjacency matrix between $v_1, \ldots, v_n$ and $u_1, \ldots, u_k$, and let $D$ be the
diagonal matrix whose $j$th entry is the sum of the $w_i$ for which $v_i$ is connected to $u_j$. Then,

\[ L_{\hat{G}} = \begin{bmatrix} B & -C \\ -C^T & D + W(kI_k - J_k) \end{bmatrix}, \]

where $J_k$ is the $k \times k$ all ones matrix and $kI_k - J_k$ is the Laplacian of the complete graph on
$k$ vertices.

Footnote 1: To solve a system $\mathrm{Sc}(M, S) x = b$, where $S$ is the last set of rows of $M$, one need merely
solve the system $M \hat{x} = \hat{b}$, where $\hat{b}$ is the same as $b$ but has zeros appended for the coordinates
in $S$. The vector $x$ is then obtained by simply ignoring the coordinates of $\hat{x}$ in $S$.

To express the Schur complement, let $D_2$ be the submatrix of $D$ obtained by excluding its
first row and column, let $C_2$ be the submatrix of $C$ excluding its first column, and let $c_1$ be
the first column of $C$. Let $c_2 = C_2 1$, so $b = c_1 + c_2$. We then have that $\mathrm{Sc}(L_{\hat{G}}, U)$ equals

\[
\begin{bmatrix} B & -c_1 \\ -c_1^T & D(1,1) + (k-1)W \end{bmatrix}
- \begin{bmatrix} C_2 \\ W 1^T \end{bmatrix}
\left(D_2 + W(kI_{k-1} - J_{k-1})\right)^{-1}
\begin{bmatrix} C_2^T & W 1 \end{bmatrix}.
\]

To understand this expression, we will show that it approaches $L_G$ as $\delta$ goes to zero. We first
note that

\[ (kI_{k-1} - J_{k-1})^{-1} = \frac{1}{k}(I_{k-1} + J_{k-1}), \]

and so

\[ C_2 (kI_{k-1} - J_{k-1})^{-1} 1 = C_2 1 = c_2. \]

So, the last row and column of the Schur complement agree with $L_G$ as $\delta$ goes to zero. On the
other hand, the upper-left block becomes

\[ B - C_2 \left(D_2 + W(kI_{k-1} - J_{k-1})\right)^{-1} C_2^T, \]

which goes to $B$ as $\delta$ goes to zero.

To bound the discrepancy in terms of $\delta$, we recall that $J = 1 1^T$, and so we can use the
Sherman-Morrison-Woodbury formula to compute

\[
\left(D_2 + W(kI_{k-1} - J_{k-1})\right)^{-1}
= (D_2 + WkI_{k-1})^{-1}
+ \frac{(D_2 + WkI_{k-1})^{-1} W J_{k-1} (D_2 + WkI_{k-1})^{-1}}{1 - W 1^T (D_2 + WkI_{k-1})^{-1} 1}.
\]

Note that $D_2 + WkI_{k-1}$ is a diagonal matrix. As all entries of $D_2$ are less than $\delta W$, every
diagonal entry of its inverse is at least $(Wk(1+\delta))^{-1}$. So, we have the entry-wise inequality


(D 2 + WkL^y 1 WJ k —\ (P 2 + Wkl k -r)- 1 ^ WJ,._i/(Wfc(l + (5)) 2 _ 1 

1 - W1 T (D 2 + WkL k _ 1 y 1 l ~ 1 /k Wk(l + 5 ) 2 k ~ V 

This tells us that, entry-wise,

\[ \left(D_2 + W(kI_{k-1} - J_{k-1})\right)^{-1} \ge \frac{1 - 2\delta}{Wk}(I_{k-1} + J_{k-1}) = (1 - 2\delta)\left(W(kI_{k-1} - J_{k-1})\right)^{-1}. \]

The claimed bound on the entries in the row and column corresponding to $u_1$ of the Schur
complement now follows from the fact that they are obtained by multiplying this matrix inverse
on either side by $C_2$ and $W1$: as these are non-negative matrices, the entry-wise inequality
propagates to the product. □


The following theorem states the approximation we obtain if we split all the vertices of high
degree and connect the clones of each vertex by expanders.

Theorem 7.2. For any graph $G = (V, E)$ with $n$ vertices, $\epsilon > 0$ and $t > 1/\epsilon^2$, there is a graph
$\hat{G} = (V \cup S, \hat{E})$ of maximum degree $O(t)$ such that

\[ G \approx_\epsilon \mathrm{Sc}(\hat{G}, S), \qquad (10) \]

$|S| = O(n/(\epsilon^2 t))$, and $|\hat{E}| \le O(n/\epsilon^2 + n/(\epsilon^4 t))$.


Proof. We first sparsify $G$ using Theorem 3.5, obtaining $\tilde{G}$ with $O(n/\epsilon^2)$ edges such that
$G \approx_{\epsilon/3} \tilde{G}$.

Let $U$ be the set of vertices in $\tilde{G}$ of degree more than $t$. We will split each vertex in $U$ into
many vertices. For each $u \in U$, let $d_u$ be its degree in $\tilde{G}$. We split $u$ into $\lceil d_u/t \rceil$ vertices, one
of which we identify with the original vertex $u$, and the rest of which we put in $S$. We then
partition the edges that were attached to $u$ among these $\lceil d_u/t \rceil$ vertices, so that each is now
attached to at most $t$ of these edges. We then place a complete graph between all of the vertices
derived from $u$ in which every edge has weight equal to the sum of the weights of edges attached
to $u$, times $12/\epsilon$. That is, we apply the construction of Lemma 7.1 with $\delta = \epsilon/12$. Call the
resulting graph $G'$.

If $\lceil d_u/t \rceil > t$, we replace that complete graph by a weighted expander of degree $O(1/\epsilon^2)$ that
is an $\epsilon/3$ approximation of this weighted complete graph, as guaranteed to exist by Lemma A.9.
The resulting graph is $\hat{G}$.

To show that (10) holds, we first show that

\[ \tilde{G} \approx_{\epsilon/3} \mathrm{Sc}(G', S). \]

Lemma 7.1 tells us that $\mathrm{Sc}(G', S) \preceq \tilde{G}$. It also tells us that the graph looks like $\tilde{G}$ except
that it can have some extra edges and that the edges attached to vertices we split can have
a slightly lower weight. If an edge is attached to just one of the split vertices, its weight can
be lower by a factor of $2\delta = \epsilon/6$. However, some edges could be attached to two of the split
vertices, in which case they could have weight that is lower by a factor of $\epsilon/3$. This implies that
$(1 - \epsilon/3)\tilde{G} \preceq \mathrm{Sc}(G', S)$. To prove (10), we now combine this with the factors of $\epsilon/3$ that we
lose by sparsifying at the start and by replacing with expanders at the end.

It is clear that every vertex in $\hat{G}$ has degree at most $t + O(1/\epsilon^2)$. To bound the number of
edges in $\hat{G}$, we observe that the sum of the degrees of vertices that are split is at most $O(n/\epsilon^2)$,
and so the number of extra vertices in $S$ is at most $O(n/(\epsilon^2 t))$. Our process of adding expanders
at the end can create at most $O(1/\epsilon^2)$ new edges for each of these vertices, giving a total of at
most $O(n/(\epsilon^4 t))$ new edges. □
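The splitting step for a single high-degree vertex can be sketched as follows. The function names and the edge representation (a list of neighbor and weight pairs) are assumptions made for illustration; the sketch handles one vertex in isolation and does not build the expander that replaces the clique when there are more than $t$ clones.

```python
import math

def split_vertex(neighbors, t):
    """Partition the edges at one vertex into ceil(d/t) groups of at most t edges.

    `neighbors` is a list of (neighbor, weight) pairs. Group 0 stays with the
    original vertex; each remaining group belongs to a new clone placed in S.
    """
    clones = math.ceil(len(neighbors) / t)
    return [neighbors[c * t:(c + 1) * t] for c in range(clones)]

def clique_weight(neighbors, eps):
    """Weight of every clone-to-clone edge: 12/eps times the total incident weight."""
    return 12.0 / eps * sum(weight for _, weight in neighbors)

edges = [(f"v{i}", 1.0) for i in range(10)]   # a hypothetical degree-10 vertex
groups = split_vertex(edges, t=4)
print([len(g) for g in groups])                # [4, 4, 2]
print(clique_weight(edges, eps=0.5))           # 240.0
```

Here two clones are added to $S$, consistent with the count of at most $d_u/t$ extra vertices per split vertex used in the proof.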

Remark 7.3. We do not presently know how to implement the exact construction from the
above theorem in polynomial time, because it relies on the nonconstructive proof of the existence
of expanders from [MSS15]. One can transform this into a polynomial time construction by
instead using the explicit constructions of Ramanujan graphs [Mar88, LPS88] as described in
Lemma A.8. This would, however, add the requirement $t > 1/\epsilon^6$ to Theorem 7.2. While this
would make Theorem 7.2 less appealing, it does not alter the statement of Theorem 7.4.


It remains to incorporate this degree reduction routine into the solver construction. Since our
goal is to upper-bound the degree by $O(\log^c n)$ for some constant $c$, we can pick $t$ in Theorem 7.2
so that $\epsilon^2 t \le \log^{O(1)} n$. This leads to a negligible increase in vertex count at each step. So we
can use a construction similar to Theorem 6.3 to obtain the lower depth solver algorithm.

Theorem 7.4. For every $n$-dimensional SDDM matrix $M$ there is a linear operator $Z$ such that

\[ Z \approx_2 M^{-1} \]

and matrix-vector multiplications in $Z$ can be done in linear work and $O(\log n \log^2 \log n)$ depth.
Furthermore, this operator can be obtained via a diagonal matrix $D$, an upper triangular matrix $U$ with
$O(n)$ non-zero entries and a set of vertices $\hat{V}$ such that

\[ M \approx_2 \mathrm{Sc}(U^T D U, \hat{V}). \]

Proof. We will slightly modify the vertex sparsification chain from Definition 5.7. Once again, we
utilize $\alpha_i = 4$ for all $i$ and $\epsilon_i = \frac{1}{2(i+2)^2}$. The main difference is that instead of using spectral
sparsifiers from Theorem 3.5 directly, we use Theorem 7.2 to control the degrees. Specifically
we invoke it with $\epsilon = \epsilon_i$ and $t_i = 200\epsilon_i^{-2}$ on $\mathrm{Sc}(M^{(i)}, F_i)$ to obtain $M^{(i+1)}$ and $S_{i+1}$ such that

\[ \mathrm{Sc}(M^{(i)}, F_i) \approx_{\epsilon_i} \mathrm{Sc}(M^{(i+1)}, S_{i+1}). \]

This leads to a slightly modified version of the vertex sparsifier chain. We obtain a sequence
of matrices $M^{(1)}, M^{(2)}, \ldots$ and subsets $S_i$ and $F_i$ such that

a. $M^{(1)} \approx_{\epsilon_0} M^{(0)}$,

b. $\mathrm{Sc}(M^{(i+1)}, S_{i+1}) \approx_{\epsilon_i} \mathrm{Sc}(M^{(i)}, F_i)$,

c. each row and column of $M^{(i)}$ has at most $t$ non-zeros,

d. each column and row of $M^{(i)}$ indexed by $F_i$ has at most $80(i+1)^4$ nonzero entries
(obtained by combining the bound on non-zeros from Theorem 7.2 with Lemma 6.1),

e. $M^{(i)}_{F_iF_i}$ is 4-strongly diagonally dominant, and

f. $M^{(d)}$ has size $O(1)$.

This modified chain can be invoked in a way analogous to the vertex sparsifier chain. At 
each step we 

1. Apply a recursively computed approximation to (Mp\ F )^ x on b^ to obtain x^ l+l \ 

2. Pad aO +1 ) with zeros on Si to obtain 

3. Repeat on level i + 1 

4. Restrict the solution x^ +l ^ to obtain x^ +1 "> 

5. Apply a recursively computed approximation to (Mp.p.) -1 to x^ +1 ^ to obtain x^. 
Let $n_i$ denote the dimension of $M^{(i)}$. Since $t$ was set to $200\epsilon_i^{-2}$, the increase in vertex size
given by $S_i$ is at most:

(l + jij) < (l - jL) 

By induction this gives

\[ n_i \le n \left(1 - \frac{1}{400}\right)^{i-1}. \]

So the total work follows in a way analogous to Theorem 5.10, and it remains to bound the depth.

The constant factor reduction in vertex count gives a bound on chain length of $d = O(\log n)$.
This in turn implies $t = O(\epsilon_i^{-2}) = O(\log^4 n)$. Therefore the depth of each matrix-vector
multiplication by $M_{F_iC_i}$ is bounded by $O(\log\log n)$. Also, choosing $k_i$ as in Theorem 6.3 gives that
the number of non-zeros in $Z^{(k_i)}_{F_iF_i}$ is bounded by $(\log n)^{O(\log\log n)}$, giving a depth of $O(\log^2 \log n)$
for each matrix-vector multiplication involving $Z^{(k_i)}_{F_iF_i}$. The $O(\log n)$ bound on $d$ then gives a
bound on the total depth of $O(\log n \log^2 \log n)$.

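This algorithm can also be viewed as a linear operator corresponding to a $U^T D U$
factorization of a larger matrix. We will construct the operators inductively. Suppose we have
$\hat{D}^{(i+1)}$ and $\hat{U}^{(i+1)}$ such that

\[ M^{(i+1)} \approx_{2\sum_{i' \ge i+1} \epsilon_{i'}} \mathrm{Sc}\left((\hat{U}^{(i+1)})^T \hat{D}^{(i+1)} \hat{U}^{(i+1)}, \hat{V}^{(i+1)}\right). \]

An argument similar to that in the proof of Lemma 6.2 gives

\[
M^{(i)} \approx_{\epsilon_i}
\begin{bmatrix} I & 0 \\ U_{F_iC_i}^T & I \end{bmatrix}
\begin{bmatrix} (Z^{(k_i)}_{F_iF_i})^{-1} & 0 \\ 0 & \mathrm{Sc}(M^{(i)}, F_i) \end{bmatrix}
\begin{bmatrix} I & U_{F_iC_i} \\ 0 & I \end{bmatrix}.
\]

Consider the entry $\mathrm{Sc}(M^{(i)}, F_i)$. Combining condition b of the chain with the inductive
hypothesis and Fact 4.4 gives

\[
\mathrm{Sc}(M^{(i)}, F_i) \approx_{\epsilon_i} \mathrm{Sc}(M^{(i+1)}, S_{i+1})
\approx_{2\sum_{i' \ge i+1} \epsilon_{i'}} \mathrm{Sc}\left(\mathrm{Sc}\left((\hat{U}^{(i+1)})^T \hat{D}^{(i+1)} \hat{U}^{(i+1)}, \hat{V}^{(i+1)}\right), S_{i+1}\right).
\]

Since the order by which we remove vertices when taking Schur complements does not matter,
we can set

\[ \hat{V}^{(i)} = \hat{V}^{(i+1)} \cup S_{i+1}, \]

to obtain

\[ \mathrm{Sc}(M^{(i)}, F_i) \approx_{\epsilon_i + 2\sum_{i' \ge i+1} \epsilon_{i'}} \mathrm{Sc}\left((\hat{U}^{(i+1)})^T \hat{D}^{(i+1)} \hat{U}^{(i+1)}, \hat{V}^{(i)}\right). \]

Block-substituting this and using Fact 4.3 then gives:

\[
M^{(i)} \approx_{2\sum_{i' \ge i} \epsilon_{i'}}
\begin{bmatrix} I & 0 \\ U_{F_iC_i}^T & I \end{bmatrix}
\begin{bmatrix} (Z^{(k_i)}_{F_iF_i})^{-1} & 0 \\ 0 & \mathrm{Sc}\left((\hat{U}^{(i+1)})^T \hat{D}^{(i+1)} \hat{U}^{(i+1)}, \hat{V}^{(i)}\right) \end{bmatrix}
\begin{bmatrix} I & U_{F_iC_i} \\ 0 & I \end{bmatrix}.
\]

We will show in Lemma 7.5 that the Schur complement operation can be taken outside
multiplications by $U^T$ and $U$. This allows us to rearrange the right-hand side into:

\[
\mathrm{Sc}\left(
\begin{bmatrix} I & 0 & 0 \\ U_{F_iC_i}^T & I & 0 \\ 0 & 0 & I_{\hat{V}^{(i)}} \end{bmatrix}
\begin{bmatrix} (Z^{(k_i)}_{F_iF_i})^{-1} & 0 \\ 0 & (\hat{U}^{(i+1)})^T \hat{D}^{(i+1)} \hat{U}^{(i+1)} \end{bmatrix}
\begin{bmatrix} I & U_{F_iC_i} & 0 \\ 0 & I & 0 \\ 0 & 0 & I_{\hat{V}^{(i)}} \end{bmatrix},
\hat{V}^{(i)} \right).
\]

Hence choosing

\[ \hat{D}^{(i)} = \begin{bmatrix} (Z^{(k_i)}_{F_iF_i})^{-1} & 0 \\ 0 & \hat{D}^{(i+1)} \end{bmatrix} \]

and

\[
\hat{U}^{(i)} = \begin{bmatrix} I & 0 \\ 0 & \hat{U}^{(i+1)} \end{bmatrix}
\begin{bmatrix} I & U_{F_iC_i} & 0 \\ 0 & I & 0 \\ 0 & 0 & I \end{bmatrix}
\]

gives $M^{(i)} \approx_{2\sum_{i' \ge i} \epsilon_{i'}} \mathrm{Sc}\left((\hat{U}^{(i)})^T \hat{D}^{(i)} \hat{U}^{(i)}, \hat{V}^{(i)}\right)$, and the inductive hypothesis holds for $i$ as
well.

We then finish the proof as in Lemma 6.2 by replacing the resulting $\hat{D}$ with a matrix $D$ whose
diagonals contain $X^{(i)}$ instead of $(Z^{(k_i)}_{F_iF_i})^{-1}$.

□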


It remains to show the needed lemma rearranging the order of taking Schur complements.

Lemma 7.5. Let $P$ be an arbitrary matrix, and $M = \mathrm{Sc}(\hat{M}, \hat{V})$. Then

\[
P^T M P = \mathrm{Sc}\left(
\begin{bmatrix} P & 0 \\ 0 & I_{\hat{V}} \end{bmatrix}^T
\hat{M}
\begin{bmatrix} P & 0 \\ 0 & I_{\hat{V}} \end{bmatrix},
\hat{V} \right).
\]

Proof. Let the rows and columns of $M$ be indexed by $V$. It suffices to show that the matrix

\[
\left( \left(
\begin{bmatrix} P & 0 \\ 0 & I_{\hat{V}} \end{bmatrix}^T
\hat{M}
\begin{bmatrix} P & 0 \\ 0 & I_{\hat{V}} \end{bmatrix}
\right)^{-1} \right)_{VV}
\]

is the same as $(P^T M P)^{-1}$. This matrix can be written as:

\[
\begin{bmatrix} P^{-1} & 0 \\ 0 & I \end{bmatrix}
\hat{M}^{-1}
\begin{bmatrix} P^{-T} & 0 \\ 0 & I \end{bmatrix}.
\]

The top left block corresponding to $V$ gives

\[ P^{-1} \left(\hat{M}^{-1}\right)_{VV} P^{-T}. \]

The definition of Schur complements gives $M^{-1} = (\hat{M}^{-1})_{VV}$, which completes the proof. □

8 Spectral Vertex Sparsification Algorithm

In this section, we give a nearly-linear work algorithm for computing spectral vertex sparsifiers.
Recall that our goal is to approximate the matrix

\[ \mathrm{Sc}(M, F) = M_{CC} - M_{CF} M_{FF}^{-1} M_{FC}. \]

Our algorithm approximates $M_{FF}^{-1}$ in a way analogous to the recent parallel solver by Peng
and Spielman [PS14]. It repeatedly writes the Schur complement as the average of the Schur
complements of two matrices. The $FF$ block in one of these is diagonal, which makes its
construction easy. The other matrix is more strictly diagonally dominant than the previous one,
so that after a small number of iterations we can approximate it by a diagonal matrix.


8.1 Splitting of Schur complement

This splitting of the Schur complement is based on the following identity from [PS14]:


\[ (D - A)^{-1} = \frac{1}{2}\left[ D^{-1} + (I + D^{-1}A)(D - AD^{-1}A)^{-1}(I + AD^{-1}) \right]. \qquad (11) \]
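Identity (11) can be verified exactly on a small instance. The sketch below does so for a hypothetical 2 x 2 pair $D$, $A$ using rational arithmetic; the helper functions are written only for 2 x 2 matrices and are not part of the original text.

```python
from fractions import Fraction as Fr

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def mat_inv(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

I = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]
D = [[Fr(3), Fr(0)], [Fr(0), Fr(4)]]   # diagonal part
A = [[Fr(0), Fr(1)], [Fr(1), Fr(0)]]   # zero diagonal

D_inv = mat_inv(D)
lhs = mat_inv(mat_sub(D, A))                                   # (D - A)^{-1}
middle = mat_inv(mat_sub(D, mat_mul(mat_mul(A, D_inv), A)))    # (D - A D^{-1} A)^{-1}
second = mat_mul(mat_mul(mat_add(I, mat_mul(D_inv, A)), middle),
                 mat_add(I, mat_mul(A, D_inv)))
rhs = [[(D_inv[i][j] + second[i][j]) / 2 for j in range(2)] for i in range(2)]
print(lhs == rhs)   # True
```

The identity follows from the factorization $D - AD^{-1}A = (D - A)D^{-1}(D + A)$, so the check succeeds for any $D$ and $A$ for which the inverses exist.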


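We write $M_{FF} = D_{FF} - A_{FF}$ where $D_{FF}$ is diagonal and $A_{FF}$ has zero diagonal, and apply
(11) to obtain the following expression for the Schur complement.

\[
\mathrm{Sc}(M, F) = \frac{1}{2}\left[ 2M_{CC} - M_{CF} D_{FF}^{-1} M_{FC}
- M_{CF} (I_{FF} + D_{FF}^{-1} A_{FF}) (D_{FF} - A_{FF} D_{FF}^{-1} A_{FF})^{-1} (I + A_{FF} D_{FF}^{-1}) M_{FC} \right]. \qquad (12)
\]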


Our key observation is that this is the average of the Schur complement of two simpler matrices.
The first term is the Schur complement of:

\[ \begin{bmatrix} D_{FF} & M_{FC} \\ M_{CF} & 0 \end{bmatrix}, \]

while the second term is the Schur complement of the matrix:

\[ \begin{bmatrix} D_{FF} - A_{FF} D_{FF}^{-1} A_{FF} & (I + A_{FF} D_{FF}^{-1}) M_{FC} \\ M_{CF} (I + D_{FF}^{-1} A_{FF}) & 2M_{CC} \end{bmatrix}. \]

This leads to a recursion similar to that used in [PS14]. However, to ensure that the Schur
complements of both matrices are SDDM, we move some of the diagonal from the $CC$ block of
the second matrix to the $CC$ block of the first. To describe this precisely, we use the notation
$\mathrm{diag}(x)$ to indicate the diagonal matrix whose entries are given by the vector $x$. We also let $1$
denote the all-ones vector. So, $\mathrm{diag}(x) 1 = x$.


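Lemma 8.1. Let $M$ be a SDDM matrix, and let $(F, C)$ be an arbitrary partition of its columns.
Let $M_{FF} = D_{FF} - A_{FF}$, where $D_{FF}$ is a diagonal matrix and $A_{FF}$ is a nonnegative matrix
with zero diagonal. Define the matrices:

\[ M_1 \stackrel{\mathrm{def}}{=} \begin{bmatrix} D_{FF} & M_{FC} \\ M_{CF} & \mathrm{diag}(M_{CF} D_{FF}^{-1} M_{FC} 1_C) \end{bmatrix} \qquad (13) \]

and

\[ M_2 \stackrel{\mathrm{def}}{=} \begin{bmatrix} D_{FF} - A_{FF} D_{FF}^{-1} A_{FF} & (I + A_{FF} D_{FF}^{-1}) M_{FC} \\ M_{CF} (I + D_{FF}^{-1} A_{FF}) & 2M_{CC} - \mathrm{diag}(M_{CF} D_{FF}^{-1} M_{FC} 1_C) \end{bmatrix}. \qquad (14) \]

Then $\mathrm{Sc}(M_1, F)$ is a Laplacian matrix, $M_2$ is a SDDM matrix, and

\[ \mathrm{Sc}(M, F) = \frac{1}{2}\left( \mathrm{Sc}(M_1, F) + \mathrm{Sc}(M_2, F) \right). \qquad (15) \]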

Proof. Equation (15) follows immediately from equation (12).

To prove that $\mathrm{Sc}(M_1, F)$ is a Laplacian matrix, we observe that all of its off-diagonal entries
are nonpositive, and that its row-sums are zero:

\[ \mathrm{Sc}(M_1, F) 1_C = \mathrm{diag}(M_{CF} D_{FF}^{-1} M_{FC} 1_C) 1_C - M_{CF} D_{FF}^{-1} M_{FC} 1_C = 0_C. \]

To prove that $M_2$ is a SDDM matrix, we observe that all of its off-diagonal entries are also
nonpositive. For the $FF$ block this follows from the nonnegativity of $A_{FF}$ and $D_{FF}$. For
the $FC$ and $CF$ blocks it follows from the nonpositivity of $M_{CF}$ and $M_{FC}$. We now show that

\[ M_2 1 \ge M 1. \]

This implies that $M_2$ is an SDDM matrix, as it implies that its row-sums are nonnegative and
not exactly zero.

We first analyze the row-sums in the rows in $F$.

\begin{align*}
(M_2 1)_F &= \begin{bmatrix} D_{FF} - A_{FF} D_{FF}^{-1} A_{FF} & (I + A_{FF} D_{FF}^{-1}) M_{FC} \end{bmatrix} \begin{bmatrix} 1_F \\ 1_C \end{bmatrix} \\
&= D_{FF} 1_F + M_{FC} 1_C - A_{FF} D_{FF}^{-1} (A_{FF} 1_F - M_{FC} 1_C) \\
&\ge D_{FF} 1_F + M_{FC} 1_C - A_{FF} D_{FF}^{-1} D_{FF} 1_F \\
&= D_{FF} 1_F - A_{FF} 1_F + M_{FC} 1_C \\
&= (M 1)_F.
\end{align*}

Before analyzing the row-sums for rows in $C$, we derive an inequality. As $M$ is diagonally
dominant, every entry of $D_{FF}^{-1} (A_{FF} 1_F - M_{FC} 1_C)$ is between 0 and 1. As $M_{CF}$ is
nonpositive, this implies that

\[ M_{CF} D_{FF}^{-1} (A_{FF} 1_F - M_{FC} 1_C) \ge M_{CF} 1_F. \]

Using this inequality, we obtain

\begin{align*}
(M_2 1)_C &= \begin{bmatrix} M_{CF} (I + D_{FF}^{-1} A_{FF}) & 2M_{CC} - \mathrm{diag}(M_{CF} D_{FF}^{-1} M_{FC} 1_C) \end{bmatrix} \begin{bmatrix} 1_F \\ 1_C \end{bmatrix} \\
&= M_{CF} 1_F + M_{CF} D_{FF}^{-1} A_{FF} 1_F + 2M_{CC} 1_C - \mathrm{diag}(M_{CF} D_{FF}^{-1} M_{FC} 1_C) 1_C \\
&= M_{CF} 1_F + M_{CF} D_{FF}^{-1} A_{FF} 1_F + 2M_{CC} 1_C - M_{CF} D_{FF}^{-1} M_{FC} 1_C \\
&= (M_{CC} 1_C + M_{CF} 1_F) + M_{CC} 1_C + M_{CF} D_{FF}^{-1} (A_{FF} 1_F - M_{FC} 1_C) \\
&\ge (M_{CC} 1_C + M_{CF} 1_F) + M_{CC} 1_C + M_{CF} 1_F \\
&= 2 (M 1)_C.
\end{align*}

□

We first discuss how to approximate the Schur complement of $M_1$.

Lemma 8.2. There is a procedure ApproxSchurDiag($M$, $(F, C)$, $\epsilon$) that takes a graph Laplacian
matrix $M$ with $m$ non-zero entries and a partition of variables $(F, C)$, and returns a
matrix $\tilde{M}_{SC}$ such that:

1. $\tilde{M}_{SC}$ has $O(m\epsilon^{-4})$ non-zero entries, and

2. $\tilde{M}_{SC} \approx_\epsilon \mathrm{Sc}(M_1, F)$, where $M_1$ is defined in equation (13).

Furthermore, the procedure takes $O(m\epsilon^{-4})$ work and $O(\log n)$ depth.

The proof is based on the observation that this graph is a sum of product demand graphs,
one per vertex in $F$. These demand graphs can be formally defined as:

Definition 8.3. The product demand graph of a vector $d$, $G(d)$, is a complete weighted graph
whose weight between vertices $i$ and $j$ is given by

\[ w_{ij} = d_i d_j. \]

In Section [XJ we give a result on directly constructing approximations to these graphs that 
can be summarized as follows: 

Lemma 8.4. There is a routine WeightedExpander($d$, $\epsilon$) such that for any demand vector
$d$ of length $n$ and a parameter $\epsilon$, WeightedExpander($d$, $\epsilon$) returns in $O(n\epsilon^{-4})$ work and
$O(\log n)$ depth a graph $H$ with $O(n\epsilon^{-4})$ edges such that

\[ L_H \approx_\epsilon L_{G(d)}. \]

Proof (of Lemma 8.2). Since there are no edges between vertices in $F$, the resulting graph
consists of one clique among the neighbors of each vertex $u \in F$. Therefore it suffices to sparsify
these separately.

It can be checked that the weight between two neighbors $v_1$ and $v_2$ in such a clique generated
from vertex $u$ is $\frac{w_{uv_1} w_{uv_2}}{d_u}$, where $d_u$ is the diagonal entry of $D_{FF}$ at $u$. Therefore we can
replace it with a weighted expander given in Lemma 8.4 above. □
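The clique described in this proof is a scaled product demand graph, which the following sketch makes concrete. The function names, the list-based vertex numbering, and the sample weights are illustrative assumptions; the sketch builds the dense clique exactly and does not perform the expander sparsification of Lemma 8.4.

```python
from fractions import Fraction

def product_demand_graph(d):
    """Edge weights d_i * d_j of the product demand graph G(d) on vertices 0..n-1."""
    n = len(d)
    return {(i, j): d[i] * d[j] for i in range(n) for j in range(i + 1, n)}

def eliminated_star_clique(w, d_u):
    """Clique created among the neighbors of u when u is eliminated.

    w[i] is the weight of the edge from u to its i-th neighbor and d_u is the
    diagonal entry of u, so the new edge (i, j) has weight w[i] * w[j] / d_u.
    """
    return {edge: Fraction(weight) / d_u
            for edge, weight in product_demand_graph(w).items()}

w = [1, 2, 3]                                   # hypothetical edge weights at u
clique = eliminated_star_clique(w, d_u=6)
print(clique[(0, 1)], clique[(0, 2)], clique[(1, 2)])   # 1/3 1/2 1
```

A vertex $u$ of degree $\deg(u)$ yields a clique with about $\deg(u)^2/2$ edges, which is why each clique is replaced by a sparse expander rather than stored explicitly.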

Now, we can invoke Lemma 8.2 on $M_1$ to approximate its Schur complement, which means it
remains to iterate on $M_2$. Of course, $M_2$ may be a dense matrix. Once again, we approximate
it implicitly using weighted expanders. Here we also need weighted bipartite expanders:

Definition 8.5. The bipartite product demand graph of two vectors $d^A$, $d^B$, $G(d^A, d^B)$, is a
weighted bipartite graph whose weight between vertices $i \in A$ and $j \in B$ is given by

\[ w_{ij} = d^A_i d^B_j. \]

Lemma 8.6. There is a routine WeightedBipartiteExpander($d^A$, $d^B$, $\epsilon$) such that for any
demand vectors $d^A$ and $d^B$ of total length $n$ and a parameter $\epsilon$, it returns in $O(n\epsilon^{-4})$ work and
$O(\log n)$ depth a graph $H$ with $O(n\epsilon^{-4})$ edges such that

\[ L_H \approx_\epsilon L_{G(d^A, d^B)}. \]

Lemma 8.7. There exists a procedure SquareSparsify such that SquareSparsify($M$, $(F, C)$, $\epsilon$)
returns in $O(m\epsilon^{-4})$ work and $O(\log n)$ depth a matrix $\tilde{M}_2$ with $O(m\epsilon^{-4})$ non-zero entries such
that $\tilde{M}_2 \approx_\epsilon M_2$, where $M_2$ is defined in equation (14).

Proof. The edges in this graph come from $-M_{FC}$, $A_{FF} D_{FF}^{-1} A_{FF}$ and $A_{FF} D_{FF}^{-1} M_{FC}$. The
first is a subset of the original edges, so we can keep them without increasing the total size by more
than a constant factor. The latter two consist of length two paths involving some $u \in F$. Therefore we
can once again sum together a set of expanders, one per each $u \in F$.

The edges in $A_{FF} D_{FF}^{-1} A_{FF}$ correspond to one clique with product demands given by $A_{uv}$
for each $u \in F$, and can be approximated using the weighted expander in Lemma 8.4.
The edges in $A_{FF} D_{FF}^{-1} M_{FC}$ can be broken down by midpoint into edges of weight

\[ \frac{A_{uv_F} A_{uv_C}}{d_u}, \]

where $v_F \in F$, $v_C \in C$ are neighbors of $u$. This is a bipartite demand graph, so we can replace
it with the weighted bipartite expanders given in Lemma 8.6.

The total size of the expanders that we generate for a vertex $u$ is $O(\deg(u)\epsilon^{-4})$. Therefore the
total graph size follows from $\sum_{u \in F} \deg(u) \le m$. □

In the next subsection, we show how to handle the case that $M$ is an $\alpha$-diagonally dominant
matrix with large $\alpha$. Therefore, the number of iterations of splitting depends on how diagonally
dominant the matrix is. Here we once again use the approach introduced in [PS14] by showing
that $M_2$ is more diagonally dominant than $M$ by a constant factor. This implies that $O(\log(1/\alpha\epsilon))$
iterations suffice for obtaining a good approximation to the Schur complement.


Lemma 8.8. If $D - A$ is $\alpha$-strongly diagonally dominant and $A$ has 0s on the diagonal, then
$D - A D^{-1} A$ is $((1 + \alpha)^2 - 1)$-strongly diagonally dominant.

Proof. Consider the sum of row $i$ in $A D^{-1} A$; it is

\[ \sum_j \sum_k \left| A_{ij} D_{jj}^{-1} A_{jk} \right| = \sum_j |A_{ij}| D_{jj}^{-1} \sum_k |A_{jk}| \le (1 + \alpha)^{-1} \sum_j |A_{ij}|, \]

where the inequality follows from applying the fact that $D - A$ is $\alpha$-strongly diagonally dominant
to the $j$th row. The result then follows from $\sum_j |A_{ij}| \le (1 + \alpha)^{-1} D_{ii}$. □
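Lemma 8.8 says that each splitting step maps $\alpha$ to $(1+\alpha)^2 - 1$, so the dominance parameter grows very quickly. The sketch below counts how many steps are needed to reach a hypothetical target; it assumes $\alpha > 0$ and ignores the $\exp(-\epsilon)$ loss from sparsification that Lemma 8.9 accounts for.

```python
def splits_needed(alpha, target):
    """Number of applications of alpha -> (1 + alpha)^2 - 1 needed to reach `target`.

    Assumes alpha > 0; with alpha = 0 the map has a fixed point and never grows.
    """
    steps = 0
    while alpha < target:
        alpha = (1 + alpha) ** 2 - 1
        steps += 1
    return steps

# Starting from alpha = 4 the sequence is 4, 24, 624, 390624, ...
print(splits_needed(4, 10 ** 6))   # 4
```

Since $1 + \alpha$ is squared at every step, the number of steps grows only doubly logarithmically in the target once $\alpha$ is bounded away from zero.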

This notion is also stable under spectral sparsification.

Lemma 8.9. If $A = X + Y$ is $\alpha$-strongly diagonally dominant, $X$ is diagonal, $Y$ is a graph
Laplacian, and $\tilde{Y} \approx_\epsilon Y$, then $\tilde{A} = X + \tilde{Y}$ is $\exp(-\epsilon)\alpha$-strongly diagonally dominant.

Proof. Using $\tilde{Y} \approx_\epsilon Y$, we have

\[ \tilde{Y}_{ii} \le \exp(\epsilon) Y_{ii}. \]

The fact that $A$ is $\alpha$-strongly diagonally dominant also gives $X_{ii} \ge \alpha Y_{ii}$. Combining these
gives $X_{ii} \ge \exp(-\epsilon)\alpha \tilde{Y}_{ii}$, which means $X + \tilde{Y}$ is $\exp(-\epsilon)\alpha$-strongly diagonally dominant. □


8.2 Schur Complement of Highly Strongly Diagonally Dominant Matrices 

It remains to show how to deal with the highly strongly diagonally dominant matrix at the last
step. Directly replacing it with its diagonal, aka. SquareSparsify, is problematic. Consider
the case where $F$ contains $u$ and $v$ with a weight $\epsilon$ edge between them, and $u$ and $v$ are connected
to $u'$ and $v'$ in $C$ by weight 1 edges respectively. Keeping only the diagonal results in a Schur
complement that disconnects $u'$ and $v'$. This however can be fixed by taking a step of random
walk within $F$. Given a SDDM matrix $M_{FF} = X_{FF} + L_{FF}$ where $L_{FF}$ is a graph Laplacian
and $X_{FF}$ is a diagonal matrix, we will consider the linear operator

\[ Z^{(last)}_{FF} = \frac{1}{2} X_{FF}^{-1} + \frac{1}{2} X_{FF}^{-1} (X_{FF} - L_{FF}) X_{FF}^{-1} (X_{FF} - L_{FF}) X_{FF}^{-1}. \qquad (16) \]

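Lemma 8.10. If $M_{FF} = X_{FF} + L_{FF}$ is a SDDM matrix that is $\alpha$-strongly diagonally dominant
for some $\alpha \ge 4$, then the operator $Z^{(last)}_{FF}$ as defined in Equation (16) satisfies:

\[ M_{FF} \preceq \left(Z^{(last)}_{FF}\right)^{-1} \preceq M_{FF} + \frac{2}{\alpha} L_{FF}. \]

Proof. Composing both sides by $X_{FF}^{-1/2}$ and substituting in $C_{FF} = X_{FF}^{-1/2} L_{FF} X_{FF}^{-1/2}$ means it
suffices to show

\[ I + C_{FF} \preceq \left( \frac{1}{2} I + \frac{1}{2} (I - C_{FF})^2 \right)^{-1} \preceq I + C_{FF} + \frac{2}{\alpha} C_{FF}. \]

The fact that $M_{FF}$ is $\alpha$-strongly diagonally dominant gives $0 \preceq L_{FF} \preceq \frac{2}{\alpha} X_{FF}$, or
$0 \preceq C_{FF} \preceq \frac{2}{\alpha} I$ (Lemma 5.3). As $C_{FF}$ and $I$ commute, the spectral theorem means it suffices
to show this for any scalar $0 \le t \le \frac{2}{\alpha}$. Note that

\[ \frac{1}{2} + \frac{1}{2}(1 - t)^2 = 1 - t + \frac{1}{2} t^2. \]

Taking the difference between the inverse of this and the 'true' value of $1 + t$ gives:

\[ \left(1 - t + \frac{1}{2} t^2\right)^{-1} - (1 + t) = \frac{1 - (1 + t)\left(1 - t + \frac{1}{2} t^2\right)}{1 - t + \frac{1}{2} t^2} = \frac{\frac{1}{2} t^2 (1 - t)}{1 - t + \frac{1}{2} t^2}. \]

Incorporating the assumption that $0 \le t \le \frac{2}{\alpha}$ and $\alpha \ge 4$ gives that the denominator is at least

\[ 1 - \frac{2}{\alpha} \ge \frac{1}{2}, \]

and the numerator term can be bounded by

\[ 0 \le \frac{t^2}{2}(1 - t) \le \frac{t}{\alpha}. \]

Combining these two bounds then gives the result.

□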
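The scalar inequality at the heart of this proof, $1 + t \le (\frac{1}{2} + \frac{1}{2}(1-t)^2)^{-1} \le 1 + t + \frac{2}{\alpha} t$ for $0 \le t \le 2/\alpha$, can be checked exactly on a grid. The function name and the grid size below are illustrative assumptions; a grid check is a sanity test and not a substitute for the proof.

```python
from fractions import Fraction

def check_last_step_bound(alpha, samples=200):
    """Check the scalar form of Lemma 8.10 on an evenly spaced grid of t in [0, 2/alpha]."""
    upper = Fraction(2) / alpha
    for s in range(samples + 1):
        t = upper * s / samples
        inverse = 1 / (Fraction(1, 2) + (1 - t) ** 2 / 2)
        if not (1 + t <= inverse <= 1 + t + upper * t):
            return False
    return True

print(check_last_step_bound(Fraction(4)))   # True
```

Rational arithmetic is used so that the boundary case $t = 0$, where all three quantities equal 1, is compared without rounding error.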
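To utilize $Z^{(last)}_{FF}$, note that the Schur complement of the matrix

$$M^{(last)} \stackrel{\mathrm{def}}{=} \begin{bmatrix} \left(Z^{(last)}_{FF}\right)^{-1} & M_{FC} \\ M_{CF} & M_{CC} \end{bmatrix} \qquad (17)$$

is equal to the average of the Schur complements of the matrices

$$M_1^{(last)} \stackrel{\mathrm{def}}{=} \begin{bmatrix} X_{FF} & M_{FC} \\ M_{CF} & \mathrm{diag}\left(M_{CF} X_{FF}^{-1} M_{FC} 1_C\right) \end{bmatrix} \qquad (18)$$

and

$$M_2^{(last)} \stackrel{\mathrm{def}}{=} \begin{bmatrix} X_{FF} & (X_{FF} - L_{FF}) X_{FF}^{-1} M_{FC} \\ M_{CF} X_{FF}^{-1} (X_{FF} - L_{FF}) & 2 M_{CC} - \mathrm{diag}\left(M_{CF} X_{FF}^{-1} M_{FC} 1_C\right) \end{bmatrix}. \qquad (19)$$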

The first matrix is SDDM by construction of its $CC$ block. We can verify that the second
matrix is also SDDM in a way that is similar to Lemma 8.1.


Lemma 8.11. Let $M$ be a SDDM matrix, and let $(F, C)$ be an arbitrary partition of its columns.
Suppose that $M_{FF}$ is $\alpha$-strongly diagonally dominant for some $\alpha \ge 4$. Define the matrices
$Z^{(last)}_{FF}$, $M^{(last)}$, $M_1^{(last)}$ and $M_2^{(last)}$ as in Equations 16, 17, 18 and 19. Then, $Sc(M_1^{(last)}, F)$
is a Laplacian matrix, $M_2^{(last)}$ is a SDDM matrix, and

$$Sc\left(M^{(last)}, F\right) = \frac{1}{2}\left(Sc\left(M_1^{(last)}, F\right) + Sc\left(M_2^{(last)}, F\right)\right). \qquad (20)$$

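Proof. Equation 20 follows from substituting Equation 16 into Equations 17, 18 and 19.

To prove that $Sc(M_1^{(last)}, F)$ is a Laplacian matrix, we observe that all of its off-diagonal
entries are nonpositive, and that its row-sums are zero:

$$Sc\left(M_1^{(last)}, F\right) 1_C = \mathrm{diag}\left(M_{CF} X_{FF}^{-1} M_{FC} 1_C\right) 1_C - M_{CF} X_{FF}^{-1} M_{FC} 1_C = 0_C.$$

To prove that $M_2^{(last)}$ is a SDDM matrix, we observe that all of its off-diagonal entries are
also nonpositive. For the $FF$ block this follows from the nonnegativity of $X_{FF}$. For the
$FC$ and $CF$ blocks it follows from the nonpositivity of $M_{CF}$ and $M_{FC}$ and the fact that the
off-diagonal entries of $L_{FF}$ are nonpositive, the diagonal of $L_{FF}$ being bounded by $\frac{2}{\alpha} X_{FF}$, and
$\alpha \ge 2$. For the $CC$ block, it follows from the fact that

$$2 M_{CC} - M_{CF} X_{FF}^{-1} M_{FC} \succeq 2 M_{CC} - 2 M_{CF} M_{FF}^{-1} M_{FC} \succeq 0.$$

We now show that

$$M_2^{(last)} 1 \ge 0.$$

This implies that $M_2^{(last)}$ is an SDDM matrix, as it implies that its row-sums are nonnegative
and not exactly zero.

We first analyze the row-sums in the rows in $F$:

$$\begin{aligned}
\left(M_2^{(last)} 1\right)_F &= \begin{bmatrix} X_{FF} & (X_{FF} - L_{FF}) X_{FF}^{-1} M_{FC} \end{bmatrix} \begin{bmatrix} 1_F \\ 1_C \end{bmatrix} \\
&= X_{FF} 1_F + (X_{FF} - L_{FF}) X_{FF}^{-1} M_{FC} 1_C \\
&\ge X_{FF} 1_F - (X_{FF} - L_{FF}) X_{FF}^{-1} X_{FF} 1_F \\
&= 0,
\end{aligned}$$

where the inequality uses the fact that $M_{FC} 1_C \ge -X_{FF} 1_F$ (which follows from $M 1 \ge 0$ and $L_{FF} 1_F = 0$)
together with the entrywise nonnegativity of $(X_{FF} - L_{FF}) X_{FF}^{-1}$.
For the row-sums in the rows in $C$, we obtain

$$\begin{aligned}
\left(M_2^{(last)} 1\right)_C &= \begin{bmatrix} M_{CF} X_{FF}^{-1} (X_{FF} - L_{FF}) & 2 M_{CC} - \mathrm{diag}\left(M_{CF} X_{FF}^{-1} M_{FC} 1_C\right) \end{bmatrix} \begin{bmatrix} 1_F \\ 1_C \end{bmatrix} \\
&= M_{CF} 1_F + 2 M_{CC} 1_C - \mathrm{diag}\left(M_{CF} X_{FF}^{-1} M_{FC} 1_C\right) 1_C \\
&\ge M_{CC} 1_C - M_{CF} X_{FF}^{-1} M_{FC} 1_C \\
&\ge M_{CC} 1_C + M_{CF} X_{FF}^{-1} X_{FF} 1_F \\
&= (M 1)_C \ge 0.
\end{aligned}$$

□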
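The averaging identity in Equation 20 can be checked on the four-vertex example from the start of this subsection ($F = \{u, v\}$ joined by an edge of weight $\epsilon$, each attached to one vertex of $C$ by a unit edge). The sketch below is illustrative only: the helper names are hypothetical, exact rational arithmetic is used to avoid rounding, and $X_{FF}$ is taken to be the identity, which is what this particular example gives.

```python
from fractions import Fraction as Fr


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def add(A, B, sa=1, sb=1):
    """Return sa * A + sb * B entrywise."""
    return [[sa * a + sb * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def diag_of_rowsums(A):
    """diag(A 1): diagonal matrix holding the row sums of A."""
    n = len(A)
    return [[sum(A[i]) if i == j else Fr(0) for j in range(n)] for i in range(n)]


def example(eps):
    one, zero = Fr(1), Fr(0)
    X_inv = [[one, zero], [zero, one]]            # X_FF^{-1}; X_FF is the identity here
    X_minus_L = [[one - eps, eps], [eps, one - eps]]
    M_FC = [[-one, zero], [zero, -one]]           # unit edges u-u' and v-v'
    M_CF = M_FC                                   # M is symmetric
    M_CC = [[one, zero], [zero, one]]

    base = matmul(matmul(M_CF, X_inv), M_FC)      # M_CF X^{-1} M_FC
    D = diag_of_rowsums(base)                     # diag(M_CF X^{-1} M_FC 1_C)
    walk = matmul(matmul(X_inv, X_minus_L), X_inv)
    walk = matmul(matmul(walk, X_minus_L), X_inv)  # X^{-1}(X-L)X^{-1}(X-L)X^{-1}
    Z = add(X_inv, walk, Fr(1, 2), Fr(1, 2))      # Equation 16

    sc_last = add(M_CC, matmul(matmul(M_CF, Z), M_FC), 1, -1)   # Sc of Equation 17
    sc_1 = add(D, base, 1, -1)                                  # Sc of Equation 18
    sc_2 = add(add(M_CC, D, 2, -1),                             # Sc of Equation 19
               matmul(matmul(M_CF, walk), M_FC), 1, -1)
    return sc_last, sc_1, sc_2


sc_last, sc_1, sc_2 = example(Fr(1, 10))
print(sc_last == add(sc_1, sc_2, Fr(1, 2), Fr(1, 2)))
```

In this example $Sc(M_1^{(last)}, F)$ is the zero matrix, so keeping only the diagonal disconnects $u'$ and $v'$, while the random-walk term in $M_2^{(last)}$ restores an edge between them.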


L schur [c\ = LastStep (Af, (F, C) , e) 

1. Form M^ abt ' > as in Equation [151 

2. Form as in Equation [TUI . 

3. M%g st) <- SquareSparsify (m % ast) , (F, C ), e/ 2 ) . 

4. Msc | ApproxSchurDiag (^Mf ast \e/2j + ^ApproxSchurDiag (^M^g St \e/ 2 ). 

5. Return Msc- 

Figure 4: Pseudocode for approximating a highly strongly diagonally dominant matrix. Small 
modifications on ApproxSchurDiag and SquareSparsify is required to handle this case. 


Lemma 8.12. Let $M$ be a SDDM matrix, and let $(F, C)$ be an arbitrary partition of its columns.
Suppose that $M_{FF}$ is $\alpha$-strongly diagonally dominant for some $\alpha \ge 4$. There exists a procedure
LastStep such that LastStep$(M, (F, C), \epsilon)$ returns in $O(m\epsilon^{-8})$ work and $O(\log n)$ depth a
matrix $M_{SC}$ with $O(m\epsilon^{-8})$ non-zero entries such that $M_{SC} \approx_{\epsilon + 2/\alpha} Sc(M, F)$.

Proof. We remark that Lemma 8.2 is designed to compute the Schur complement of the matrix (13)
and Lemma 8.7 is designed to sparsify the matrix (14). However, it is easy to modify
them to work for computing the Schur complement of the matrix (18) and sparsifying the
matrix (19).

By Lemma 8.11 we know that SquareSparsify takes $O(m\epsilon^{-4})$ work and $O(\log n)$ depth
and outputs the matrix $\tilde{M}_2^{(last)}$ with $O(m\epsilon^{-4})$ non-zero entries. Therefore, Lemma 8.2 shows
that ApproxSchurDiag takes $O(m\epsilon^{-8})$ work and $O(\log n)$ depth and outputs a matrix with
$O(m\epsilon^{-8})$ non-zero entries. This proves the running time and the output size.

For the approximation guarantee, Lemmas 8.10, 8.11 and 8.2 give:

$$\begin{aligned}
Sc(M, F) &\approx_{2/\alpha} \frac{1}{2}\left(Sc\left(M_1^{(last)}, F\right) + Sc\left(M_2^{(last)}, F\right)\right) \\
&\approx_{\epsilon/2} \frac{1}{2}\left(Sc\left(M_1^{(last)}, F\right) + Sc\left(\tilde{M}_2^{(last)}, F\right)\right) \\
&\approx_{\epsilon/2} \frac{1}{2}\,\text{ApproxSchurDiag}\left(M_1^{(last)}, \epsilon/2\right) + \frac{1}{2}\,\text{ApproxSchurDiag}\left(\tilde{M}_2^{(last)}, \epsilon/2\right) \\
&= M_{SC}.
\end{aligned}$$

□

8.3 Summary 

Combining the splitting step and the final step gives our algorithm (Figure 5).


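$M_{SC}$ = ApproxSchur$(M, (F, C), \alpha, \epsilon)$

1. Initialize $M_{SC} \leftarrow 0$, $M^{(0)} \leftarrow M$, $d = \log_{1+\alpha}(13\epsilon^{-1})$.

2. For $i$ from 1 to $d$ do

(a) Form $M_1^{(i-1)}$ as in Equation 13.

(b) Form $M_2^{(i-1)}$ as in Equation 14.

(c) $M_{SC} \leftarrow M_{SC} + \frac{1}{2}$ ApproxSchurDiag$(M_1^{(i-1)}, (F, C), \frac{\epsilon}{3d})$.

(d) $M^{(i)} \leftarrow \frac{1}{2}$ SquareSparsify$(M_2^{(i-1)}, (F, C), \frac{\epsilon}{3d})$.

3. $M_{SC} \leftarrow M_{SC} +$ LastStep$(M^{(d)}, (F, C), \epsilon/3)$.

4. Return $M_{SC}$.


Figure 5: Pseudocode for Computing Spectral Vertex Sparsifiers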

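Theorem 8.13. Suppose that $M$ is $\alpha$-strongly diagonally dominant and $0 < \epsilon < 1$. Then
ApproxSchur returns a matrix $M_{SC}$ with $O\left(m \left(\epsilon^{-1} \log_\alpha(\epsilon^{-1})\right)^{O(\log_\alpha(\epsilon^{-1}))}\right)$ non-zeros such
that

$$M_{SC} \approx_\epsilon Sc(M, F)$$

in $O\left(m \left(\epsilon^{-1} \log_\alpha(\epsilon^{-1})\right)^{O(\log_\alpha(\epsilon^{-1}))}\right)$ work and $O\left(\log_\alpha(\epsilon^{-1}) \log n\right)$ depth.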

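Proof. Let $M_{SC}^{(i)}$ denote $M_{SC}$ after $i$ steps of the main loop in ApproxSchur. We will show
by induction that for each $i$,

$$Sc(M, F) \approx_{i\epsilon/(3d)} M_{SC}^{(i)} + Sc\left(M^{(i)}, F\right).$$

The base case of $i = 0$ clearly holds. For the inductive case, suppose we have the result for some
$i$; then

$$Sc(M, F) \approx_{i\epsilon/(3d)} M_{SC}^{(i)} + \frac{1}{2}\left(Sc\left(M_1^{(i)}, F\right) + Sc\left(M_2^{(i)}, F\right)\right).$$

Lemma 8.2 gives

$$M_{SC}^{(i+1)} = M_{SC}^{(i)} + \frac{1}{2}\,\text{ApproxSchurDiag}\left(M_1^{(i)}, (F, C), \frac{\epsilon}{3d}\right) \approx_{\epsilon/(3d)} M_{SC}^{(i)} + \frac{1}{2} Sc\left(M_1^{(i)}, F\right), \qquad (21)$$

while Lemma 8.7 gives

$$M^{(i+1)} \approx_{\epsilon/(3d)} \frac{1}{2} M_2^{(i)},$$

which combined with the preservation of Loewner ordering from Fact 4.4 gives

$$Sc\left(M^{(i+1)}, F\right) \approx_{\epsilon/(3d)} \frac{1}{2} Sc\left(M_2^{(i)}, F\right). \qquad (22)$$

Combining these two bounds (21) and (22) then gives:

$$M_{SC}^{(i)} + \frac{1}{2}\left(Sc\left(M_1^{(i)}, F\right) + Sc\left(M_2^{(i)}, F\right)\right) \approx_{\epsilon/(3d)} M_{SC}^{(i+1)} + Sc\left(M^{(i+1)}, F\right).$$

Hence, the inductive hypothesis holds for $i + 1$ as well.

By Lemmas 8.8 and 8.9 we have that $M^{(d)}_{FF}$ is $12\epsilon^{-1}$-strongly diagonally dominant at the
last step. Lemma 8.12 then gives

$$\text{LastStep}\left(M^{(d)}, (F, C), \frac{\epsilon}{3}\right) \approx_{\frac{\epsilon}{3} + \frac{2\epsilon}{12}} Sc\left(M^{(d)}, F\right).$$

Composing this bound with the guarantees of the iterations then gives the bound on overall error.
The work of these steps, and the size of the output graph, follow from Lemmas 8.2 and 8.7. □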

In our invocations of this routine, both $\alpha$ and $\epsilon$ will be set to constants. As a result, this
procedure theoretically runs in $O(m)$ time. For a spectral vertex sparsification algorithm that handles
general graph Laplacians, $\alpha$ can be 0 and we need to invoke spectral sparsifiers on the intermediate
matrices after each step. Any parallel algorithm for spectral sparsification (e.g. [ST11, SS11, OV11, Kou14]) will
then lead to nearly linear work and polylogarithmic depth.

Corollary 8.14. Given a SDDM matrix with condition number $\kappa$, a partition of the vertices into
$(F, C)$, and error $\epsilon > 0$, we can compute in $O\left(m \log^{O(1)}(n\kappa\epsilon^{-1})\right)$ work and $O\left(\log^{O(1)}(n\kappa\epsilon^{-1})\right)$
depth a matrix $M_{SC}$ with $O\left(n \log^{O(1)} n\, \epsilon^{-2}\right)$ non-zeros such that

$$M_{SC} \approx_\epsilon Sc(M, F).$$

Proof. We can add to each element on the diagonal to obtain M r ~ e M. Therefore it 

suffices to assume that Mpp is poiy 1 ^)^ -strongly diagonally dominant. 

Therefore Theorem 8.13 gives that ApproxSchur terminates in $d = O(\log \kappa + \log n)$ steps.
If we invoke a spectral sparsification algorithm at each step, the number of non-zeros in each
$M^{(i)}$ can be bounded by $O\left(n \log^{O(1)} n\, (\epsilon/d)^{-2}\right) = O\left(n \log^{O(1)}(n\kappa\epsilon^{-1})\right)$. The overall work bound
then follows from combining this with the $\mathrm{poly}(\epsilon^{-1} d)$ increase in edge count at each step, and
the nearly-linear work guarantees of spectral sparsification algorithms. □
We remark that setting the per-step error to $1/\log \kappa$ leads to a fairly large number of log factors. In the
rest of this paper we only invoke spectral vertex sparsifiers with moderate values of $\epsilon$ (unless
we are at graphs that are smaller by poly(n) factors). Also, we believe recent developments
in faster combinatorial spectral sparsification algorithms [Kou14] make faster algorithms for
spectral vertex sparsifiers a question beyond the scope of this paper.

9 Algorithmic Constructions 

In this section, we give two algorithms to compute vertex sparsifier chains: the first algorithm
uses existing spectral sparsifiers for graphs and the second algorithm does not. Although
combining the two approaches gives a better theoretical result, we do not show it because we believe
better spectral sparsifier algorithms for graphs will appear soon, and hybrid approaches may
not be useful then.

9.1 Black Box Construction 

The first construction relies on existing parallel spectral sparsifier algorithms. For concreteness,
we use the parallel spectral graph sparsification algorithm given by Koutis [Kou14].

Theorem 9.1. Given any SDD matrix $M$ with $n$ variables and $m$ non-zeros, there is an
algorithm BlackBoxSparsify$(M, \epsilon)$ that outputs a SDD matrix $B$ with $O(n \log^3 n/\epsilon^2)$ non-zeros
such that $M \approx_\epsilon B$ in $O(\log^3 n \log \rho/\epsilon^2)$ depth and $O((m + n \log^3 n/\epsilon^2) \log^2 n/\epsilon^2)$ work, where

$$\rho = \frac{m}{n \log^3 n/\epsilon^2}.$$


$(M^{(1)}, M^{(2)}, \cdots; F_1, F_2, \cdots)$ = BlackBoxConstruct$(M^{(0)})$

1. Let $k = 1$, $M^{(1)} \leftarrow M^{(0)}$ and $F_0$ be the set of all variables.

2. While $M^{(k)}$ has more than 100 variables

(a) $M^{(k)} \leftarrow$ BlackBoxSparsify$(M^{(k)}, 1/(k \log^2(k + 4)))$.

(b) Find a subset $F_k$ of size $\Omega(n^{(k)})$ such that $M^{(k)}_{F_k F_k}$ is 4-strongly diagonally dominant.

(c) $M^{(k+1)} \leftarrow$ ApproxSchur$(M^{(k)}, (F_k, F_{k-1} \setminus F_k), 4, 1/(k \log^2(k + 4)))$.

(d) $k \leftarrow k + 1$.


Figure 6: Pseudocode for Constructing Vertex Sparsifier Chains Using Existing Spectral
Sparsifiers

In the $k$-th step of the algorithm, we sparsify the graph and compute an approximate Schur
complement to $1/(k \log^2(k + 4))$ accuracy. This ensures that the cumulative error is upper
bounded by $\sum_{k=1}^{\infty} 1/(k \log^2(k + 4))$, which is a constant.
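To see that this error schedule really sums to a constant, one can evaluate partial sums numerically. The sketch below is only an illustration: the names `epsilon_k` and `partial_error_sum` are hypothetical, and the logarithm is assumed to be the natural logarithm (a different base only rescales the sum by a constant).

```python
import math


def epsilon_k(k):
    """Accuracy used in the k-th step: 1 / (k * log^2(k + 4)), natural log assumed."""
    return 1.0 / (k * math.log(k + 4) ** 2)


def partial_error_sum(num_steps):
    """Sum of epsilon_k(k) for k = 1..num_steps."""
    return sum(epsilon_k(k) for k in range(1, num_steps + 1))


for steps in (10, 1000, 100000):
    print(steps, round(partial_error_sum(steps), 4))
```

The partial sums grow very slowly because the tail beyond step $N$ is at most $\int_N^\infty \frac{dx}{x \ln^2 x} = 1/\ln N$. Since the algorithm runs for only $O(\log n)$ iterations, the accumulated error stays bounded independently of $n$.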
Theorem 9.2. Given any SDD matrix $M^{(0)}$ with $n$ variables and $m$ non-zeros, the algorithm
BlackBoxConstruct$(M^{(0)})$ returns a vertex sparsifier chain such that the linear operator
$W$ corresponding to it satisfies

$$W^{\dagger} \approx_{O(1)} M^{(0)}.$$

Also, we can evaluate $Wb$ in $O(\log^2(n) \log \log n)$ depth and $O(n \log^3 n \log \log n)$ work for any
vector $b$.

Furthermore, the algorithm BlackBoxConstruct$(M^{(0)})$ runs in $O(\log^6 n \log^4 \log n)$ depth
and $O(m \log^2 n + n \log^5 n)$ work.

Proof. Let $n^{(k)}$ and $m^{(k)}$ be the number of vertices and non-zero entries in the matrix $M^{(k)}$. Let
$s(n) = n \log^3 n$, which is the output size of BlackBoxSparsify, and $\epsilon(k) = 1/(k \log^2(k + 4))$,
which is the accuracy of the $k$-th sparsification and approximate Schur complement.

We first prove the correctness of the algorithm. The ending condition ensures $M^{(last)}$ has
size $O(1)$; steps (2a) and (2c) ensure $M^{(k+1)} \approx_{2\epsilon(k)} Sc(M^{(k)}, F_k)$; and step (2b) ensures $M^{(k)}_{F_k F_k}$
is 4-strongly diagonally dominant. Therefore, the chain $(M^{(1)}, M^{(2)}, \cdots; F_1, F_2, \cdots)$ is a vertex sparsifier
chain. Since the cumulative error $\sum_k 2\epsilon(k) = O(1)$, Lemma 5.8 shows that the resultant operator
$W$ satisfies

$$W^{\dagger} \approx_{O(1)} M^{(0)}.$$

Now, we upper bound the cost of evaluating $Wb$. Lemma 5.2 shows that $|F_k| = \Omega(n^{(k)})$ and
hence a constant portion of variables is eliminated each iteration. Therefore, $n^{(k)} \le c^{k-1} n$ for
some $c < 1$. Using this, Lemma 5.8 shows the depth for evaluating $Wb$ is

$$O\left(\sum_{k=1}^{O(\log n)} \log(k) \log(n)\right) = O(\log^2(n) \log \log n)$$

and the work for evaluating $Wb$ is

$$O\left(\sum_{k=1}^{O(\log n)} \log(k)\, s(c^{k-1} n)/\epsilon(k)^2\right).$$

Using $s(n) = n \log^3 n$ and $\epsilon(k) = 1/(k \log^2(k + 4))$, the work for evaluating $Wb$ is simply $O(s(n))$.

For the work and depth of the construction, Lemma 5.2 shows that it takes $O(m^{(k)})$ work and
$O(\log n^{(k)})$ depth to find $F_k$, and Theorem 8.13 shows that ApproxSchur takes $O(m^{(k)} k^{O(\log k)})$
work and $O(\log n^{(k)} \log k)$ depth. Using $n^{(k)} \le c^{k-1} n$ and $m^{(k)} = s(n^{(k)})/\epsilon(k)^2$, the total work
for this algorithm excluding BlackBoxSparsify is

$$\sum_{k=1}^{O(\log n)} O\left(s(c^{k-1} n)\, k^{O(\log k)}/\epsilon(k)^2\right) = O(s(n)).$$

Hence, the total work for BlackBoxConstruct is

$$O(s(n)) + O(m \log^2 n) + \sum_{k=2}^{O(\log n)} O\left(s(n^{(k)})\, k^{O(\log k)} \log^2 n/\epsilon(k)^2\right).$$

Using the fact that $s(n^{(k)})$ is geometrically decreasing, the total work is $O(m \log^2 n + n \log^5 n)$. We can bound
the total depth similarly. □
Remark 9.3. Given a sparsifier algorithm that takes $d(m, n)$ depth and $w(m, n)/\epsilon^2$ work to
find a sparsifier of size $s(n)/\epsilon^2$, BlackBoxConstruct roughly takes $O(\log^2 n \log \log n) +
O(d(m, n) \log n)$ depth and $O(w(m, n))$ work to construct a vertex sparsifier chain, and such a
chain has total depth $O(\log^2 n \log \log n)$ and total work $O(s(n))$.

Therefore, the work for preprocessing is roughly linear in the work needed to sparsify, and
the work for solving is linear in the size of the sparsifier. Hence, solving Laplacian systems is nearly
as simple as computing sparsifiers.


9.2 Recursive Construction 


We now give a recursive construction based on the idea that solvers can be used to compute
sampling probabilities [SS11]. We will describe the construction in phases, each containing
$r$ iterations. Each iteration decreases the number of vertices while maintaining the density of
the graph. We maintain the density by the general sparsification technique introduced by [CLM+14]
as follows:

Lemma 9.4 ([CLM+14]). Let $\mathcal{M}$ be a class of positive definite $n \times n$ matrices. Let $\mathcal{M}(m)$
be the set of all $B^T B \in \mathcal{M}$ such that $B$ has $m$ rows. Assume that

1. For any $B^T B \in \mathcal{M}$ and non-negative diagonal matrix $D$, we have $B^T D B \in \mathcal{M}$.

2. For any matrix $B^T B \in \mathcal{M}(m)$, we can check if every row $b$ is in $\mathrm{im}(B^T)$ or not in depth
$d_{chk}(m)$ and work $w_{chk}(m)$.

3. For any $B^T B \in \mathcal{M}(m)$, we can find an implicit representation of a matrix $W$ such that
$W \approx_1 (B^T B)^{\dagger}$ in depth $d_{con}(m, n)$ and work $w_{con}(m, n)$, and for any vector $b$, we can
evaluate $Wb$ in depth $d_{eval}(m, n)$ and work $w_{eval}(m, n)$.

For any $k \ge 1$, $1 \ge \epsilon > 0$ and matrix $B^T B \in \mathcal{M}(m)$, the algorithm Sparsify$(B^T B, k, \epsilon)$
outputs an explicit matrix $C^T C \in \mathcal{M}(O(kn \log n/\epsilon^2))$ with $C^T C \approx_\epsilon B^T B$.

Also, this algorithm runs in $d_{con}(\frac{m}{k}, n) + O(d_{eval}(m, n) + d_{chk}(m) + \log n)$ depth and $w_{con}(\frac{m}{k}, n) +
O(w_{eval}(m, n) \log n + w_{chk}(m) + m \log n)$ work.

Each call of spectral vertex sparsification increases edge density, but the Sparsify routine
allows us to reduce the density at a much faster rate. A higher reduction parameter $r$ in the
algorithm RecursiveConstruct$_r$ allows us to reduce the cost of these recursive sparsification steps.

The following lemma proves that the algorithm RecursiveConstruct$_r$ produces a vertex
sparsifier chain and that the linear operator corresponding to the vertex sparsifier chain can be evaluated
efficiently.


$(M^{(1)}, M^{(2)}, \cdots; F_1, F_2, \cdots)$ = RecursiveConstruct$_r(M^{(0)})$

1. $M^{(1)} \leftarrow$ Sparsify$(M^{(0)}, 2^{c_2 r}, 1/4)$, $k \leftarrow 1$ and $F_0$ be the set of all variables.

2. While $M^{(k)}$ has more than $O(1)^r$ vertices,

(a) Find a subset $F_k$ of size $\Omega(n^{(k)})$ such that $M^{(k)}_{F_k F_k}$ is 4-strongly diagonally dominant.

(b) $M^{(k+1)} \leftarrow$ ApproxSchur$(M^{(k)}, (F_k, F_{k-1} \setminus F_k), 4, (k + 8)^{-2})$.

(c) If $k + 1 \bmod r = 0$, then

i. $M^{(k+1)} \leftarrow$ Sparsify$(M^{(k+1)}, (k + 9)^{-2}, 2^{2 c_2 r \log^2(k+1)})$.

(d) $k \leftarrow k + 1$.


Figure 7: Pseudocode for Recursively Constructing Vertex Sparsifier Chains


Lemma 9.5. Let $r$ be a large enough constant. There are universal constants $0 < c_1 < 1$ and
$c_2 > 0$ such that for any SDD matrix $M^{(0)}$ with $n$ variables, the algorithm RecursiveConstruct$_r(M^{(0)})$
returns a vertex sparsifier chain $(M^{(1)}, M^{(2)}, \cdots; F_1, F_2, \cdots)$ satisfying the following conditions:

1. For all $k \ge 1$, $n^{(k)} \le c_1^{k-1} n$, where $n^{(k)}$ is the number of variables in $M^{(k)}$.

2. Except in step 1, at any moment, every intermediate matrix $M$ that appears at the $k$-th iteration
has density

$$\frac{m'}{n' \log n'} \le 2^{3 c_2 r \log^2 k}$$

for $k \ge 1$, where $m'$ and $n'$ are the numbers of non-zeros and variables of $M$.

3. For all $k \ge 1$, $M^{(k)}_{F_k F_k}$ is 4-strongly diagonally dominant.

4. For all $k \ge 1$, $M^{(k+1)} \approx_{2(k+8)^{-2}} Sc(M^{(k)}, F_k)$.

Furthermore, the linear operator $W$ corresponding to the vertex sparsifier chain satisfies

$$W \approx_1 \left(M^{(0)}\right)^{\dagger}.$$

Also, we can evaluate $Wb$ in $O(\log^2 n \log \log n)$ depth and $2^{O(r \log^2 r)} n \log n$ work for any vector
$b$.


Proof. For assertion (1), we note that step (2a) ensures $|F_k| = \Omega(n^{(k)})$ and hence a
constant portion of variables is eliminated each iteration. This proves $n^{(k)} \le c^{k-1} n$ for some $c$.

For assertion (2), Theorem 8.13 shows that after the approximate Schur complement

$$m^{(k+1)} = O\left(m^{(k)} \left(k^2 \log(k + 8)\right)^{O(\log(k+8))}\right) \le 2^{O(\log^2(k+1))} m^{(k)}.$$

Hence, each iteration increases the density by at most $2^{c_2 \log^2(k+1)}$ for some constant
$c_2$. After the Sparsify step in (2ci), we have

$$\frac{m^{(sr)}}{n^{(sr)} \log n^{(sr)}} \le 2^{2 c_2 r \log^2(sr)}.$$

Then, after $r$ iterations of ApproxSchur and before the sparsification of $M^{((s+1)r)}$, we have

$$\frac{m^{((s+1)r)}}{n^{((s+1)r)} \log n^{((s+1)r)}} \le 2^{2 c_2 r \log^2(sr)} \cdot 2^{c_2 \log^2(sr+1)} \cdots 2^{c_2 \log^2((s+1)r)} \le 2^{3 c_2 r \log^2((s+1)r)}.$$

This proves assertion (2).

Assertion (3) follows from the construction of $F_k$ in step (2a).

For assertion (4), we note that in step (2b), we construct the approximate Schur
complement $M^{(k+1)}$ such that $M^{(k+1)} \approx_{(k+8)^{-2}} Sc(M^{(k)}, F_k)$. Therefore, we only need to check
$M^{(sr)}$ for all $s$, because $M^{(sr)}$ is modified at step (2ci) after the sparsification. Note that
Lemma 9.4 guarantees that $M^{(k)}$ changes only by a $(k + 8)^{-2}$ factor. Hence, in total, we have
$M^{(k+1)} \approx_{2(k+8)^{-2}} Sc(M^{(k)}, F_k)$.

For the last claim, Lemma 5.8 shows that

$$W \approx_{1/2 + 4\sum_k (k+8)^{-2}} \left(M^{(0)}\right)^{\dagger}$$

and we can evaluate $Wb$ in

$$O\left(\sum_k \log k \log n^{(k)}\right) = O(\log^2 n \log \log n)$$

depth and

$$O\left(\sum_k 2^{3 c_2 r \log^2 k}\, n^{(k)} \log n^{(k)} \log k\right) = 2^{O(r \log^2 r)} n \log n$$

work. □
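The constant in the last claim can be checked directly: the accumulated error $1/2 + 4\sum_{k \ge 1}(k+8)^{-2}$ must be at most 1 for the conclusion $W \approx_1 (M^{(0)})^{\dagger}$. The following sketch is an illustration only (the function name `chain_error_bound` is hypothetical); it sums a finite prefix exactly and bounds the tail by an integral.

```python
def chain_error_bound(num_terms):
    """Upper bound on 1/2 + 4 * sum_{k>=1} (k + 8)^(-2).

    The first num_terms terms are summed directly; the remaining tail is
    bounded by the integral of (x + 8)^(-2) from num_terms to infinity,
    which equals 1 / (num_terms + 8).
    """
    head = sum((k + 8) ** -2 for k in range(1, num_terms + 1))
    tail = 1.0 / (num_terms + 8)
    return 0.5 + 4.0 * (head + tail)


print(chain_error_bound(10_000))
```

The bound comes out slightly above 0.97, so the offsets 8 and 9 in the accuracies $(k+8)^{-2}$ and $(k+9)^{-2}$ of Figure 7 leave only a small amount of slack below 1.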

In the algorithm RecursiveConstruct$_r$, we call the $(sr + 1)$-th to the $((s + 1)r)$-th iterations
the $s$-th phase. At the end of each phase, Sparsify is called once. The previous lemma
showed that the density of the graph at the $k$-th iteration is less than $2^{3 c_2 r \log^2 k}$. This explains
our choice of reduction factor $2^{2 c_2 r \log^2 k}$ in the Sparsify algorithm as follows:

Lemma 9.6. Let $n^{(k)}$ be the number of variables of $M^{(k)}$. From the $(sr + 1)$-th to the $((s + 1)r)$-th
iteration, including the Sparsify call at the end, the algorithm takes

$$2^{O(r \log^2(sr))}\, n^{(sr+1)} \log^2 n^{(sr+1)}$$

work and

$$O\left(r \log(sr) \log^2 n^{(sr)}\right) + O\left(\log^2 n^{(sr+1)} \log \log n^{(sr+1)}\right)$$

depth, plus the time to construct the vertex sparsifier chain for a SDD matrix with $n^{((s+1)r)}$
variables and $2^{c_2 r \log^2((s+1)r)}\, n^{((s+1)r)} \log n^{((s+1)r)}$ non-zeros.

Proof. Let $m^{(k)}$ and $n^{(k)}$ be the number of non-zeros and variables in $M^{(k)}$ before the Sparsify
call, if there is one. Lemma 5.2 and Theorem 8.13 show that the $k$-th iteration
takes $O(m^{(k)} + m^{(k+1)})$ work and $O(\log k \log n^{(k)})$ depth. Lemma 9.5 shows that

$$n^{(k)} \le c^{k-1} n \quad \text{and} \quad m^{(k)} \le 2^{3 c_2 r \log^2 k}\, n^{(k)} \log n^{(k)}$$

and hence, from the $(sr + 1)$-th to the $((s + 1)r)$-th iteration (excluding the Sparsify call at the
end), the algorithm takes

$$\sum_{k=sr+1}^{(s+1)r} O\left(m^{(k)} + m^{(k+1)}\right) \le \sum_{k=sr+1}^{(s+1)r} 2^{O(r \log^2 k)}\, n^{(k)} \log n^{(k)} \le 2^{O(r \log^2(sr))}\, n^{(sr+1)} \log n^{(sr+1)}$$

work and

$$\sum_{k=sr+1}^{(s+1)r} O\left(\log k \log n^{(k)}\right) \le O\left(r \log(sr) \log n^{(sr)}\right)$$

depth.

Now, we bound the cost of the Sparsify call. Let $m^*$ and $n^*$ be the number of
non-zeros and variables in $M^{((s+1)r)}$ before the Sparsify call. Lemma 9.4 shows that the
Sparsify call takes $d_{con}\left(m^* 2^{-2 c_2 r \log^2((s+1)r)}, n^*\right) + O(d_{eval}(m^*, n^*) + d_{chk}(m^*) + \log n^*)$ depth
and $w_{con}\left(m^* 2^{-2 c_2 r \log^2((s+1)r)}, n^*\right) + O(w_{eval}(m^*, n^*) \log n^* + w_{chk}(m^*) + m^* \log n^*)$ work.

For any SDD matrix $B^T B$, an edge $b \in \mathrm{im}(B^T)$ if and only if the end points of the edge
are in the same connected component of the graph corresponding to $B^T B$. Halperin and Zwick
[HZ96] show how to compute the connected components of a graph with $m$ edges and $n$ vertices
in $O(\log n)$ depth and $O(m + n)$ work in the EREW PRAM model. Using this, we can check
every edge in $O(\log n)$ depth and $O(m + n)$ work.
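The membership test itself is simple to illustrate. The sketch below checks, for each candidate edge, whether its endpoints lie in the same connected component. Note the substitution: it uses a sequential union-find structure rather than the parallel algorithm of [HZ96], so it shows only the criterion, not the stated depth bound. All function names are hypothetical.

```python
def find(parent, x):
    """Return the representative of x, compressing the path along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def component_labels(num_vertices, edges):
    """Label every vertex with the representative of its connected component."""
    parent = list(range(num_vertices))
    for u, v in edges:
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[ru] = rv
    return [find(parent, x) for x in range(num_vertices)]


def rows_in_image(num_vertices, graph_edges, candidate_edges):
    """For each candidate edge (u, v), report whether u and v are connected
    in the graph, i.e. whether the edge vector lies in im(B^T)."""
    label = component_labels(num_vertices, graph_edges)
    return [label[u] == label[v] for u, v in candidate_edges]


graph = [(0, 1), (1, 2), (3, 4)]
print(rows_in_image(5, graph, [(0, 2), (2, 3), (3, 4)]))
```

Here vertices 0, 1, 2 form one component and 3, 4 another, so the edge (2, 3) crosses components and is rejected while the other two are accepted.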

To construct an implicit approximate inverse for the sampled SDD matrix, we can use
RecursiveConstruct$_r$. Lemma 9.5 showed that it takes $O(\log^2 n^* \log \log n^*)$ depth and
$2^{O(r \log^2 r)}\, n^* \log n^*$ work to apply the approximate inverse once.

Hence, the total running time from the $(sr + 1)$-th to the $((s + 1)r)$-th iteration, including the
Sparsify call, is the time to construct the vertex sparsifier chain plus

$$2^{O(r \log^2(sr))}\, n^{(sr+1)} \log^2 n^{(sr+1)}$$

extra work and

$$O\left(r \log(sr) \log n^{(sr)}\right) + O\left(\log^2 n^{(sr+1)} \log \log n^{(sr+1)}\right)$$

extra depth. □

Note that at the end of the $s$-th phase, the time required to construct an extra vertex sparsifier
chain for the Sparsify call is less than the remaining cost after the $s$-th phase. This is the reason
why we use $2^{2 c_2 r \log^2 k}$ as the reduction factor for the Sparsify call. The following lemma
accounts for the recursive calls and shows the total running time of the algorithm.

Lemma 9.7. With high probability, the algorithm RecursiveConstruct$_r(M^{(0)})$ returns a
vertex sparsifier chain such that the linear operator $W$ corresponding to it satisfies

$$W \approx_1 \left(M^{(0)}\right)^{\dagger}.$$

Assuming $r \log^2 r = o(\log n)$, we can evaluate $Wb$ in $O(\log^2 n \log \log n)$ depth and $2^{O(r \log^2 r)} n \log n$
work. Also, the algorithm RecursiveConstruct$_r$ takes $2^{O(\log n/r)}$ depth and $m \log n +
2^{O(r \log^2 r)} n \log^2 n$ work.

Proof. All results are proved in Lemma 9.5 except the construction time.

To bound the construction time, we first consider the case where $M^{(0)}$ has only $2^{c_2 r} n \log n$
non-zeros. In that case, the algorithm skips step 1 because the matrix is already sparse. Lemma 9.6
shows that during the $s$-th phase, the Sparsify call requires us to construct an extra vertex
sparsifier chain for a matrix with $n^{((s+1)r)}$ variables and at most $2^{c_2 r \log^2((s+1)r)}\, n^{((s+1)r)} \log n^{((s+1)r)}$
non-zeros. Also, we know that Sparsify returns a matrix with $n^{((s+1)r)}$ variables and
$2^{2 c_2 r \log^2((s+1)r)}\, n^{((s+1)r)} \log n^{((s+1)r)}$ non-zeros. Hence, the cost of the remaining iterations (excluding
the recursion created afterward) is larger than the cost to construct the extra vertex sparsifier
chain required at the $s$-th phase.

Hence, considering this recursion factor, the running time of the $s$-th phase is multiplied by
a factor of $2^s$.

Since there are $O(\log n/r)$ phases and $r \log^2 r = o(\log n)$, the total depth of the algorithm is

$$\begin{aligned}
&\sum_{s=1}^{O(\log n/r)} 2^s \left(r \log(sr) \log^2 n^{(sr)} + \log^2 n^{(sr)} \log \log n^{(sr)}\right) \\
&\quad = 2^{O(\log n/r)}\, O\left(r \log \log n \log^2 n^{(last)} + \log^2 n^{(last)} \log \log n^{(last)}\right) \\
&\quad = 2^{O(\log n/r)}\, O\left(r^2 \log \log n + r^2 \log r\right) \\
&\quad = 2^{O(\log n/r)}\, r^2 \log \log(n) \\
&\quad = 2^{O(\log n/r)}
\end{aligned}$$

and the total work of the algorithm is

$$\sum_{s=1}^{O(\log n/r)} 2^s \left(2^{O(r \log^2(sr))}\, n^{(sr+1)} \log^2 n^{(sr+1)}\right) = 2^{O(r \log^2 r)} n \log^2 n.$$

For general $m$, during the first step, Sparsify, we need to solve a certain SDD matrix
with at most $m 2^{-c_2 r}$ non-zeros and $n$ variables. To solve that SDD matrix, we use
RecursiveConstruct$_r$ to construct a vertex sparsifier chain and use the chain to solve
$O(\log(n))$ different right hand sides. Using $r \log^2 r = o(\log n)$, the total depth for this algorithm
is

$$O\left(\log \frac{m}{n \log n}\right) 2^{O(\log n/r)} = 2^{O(\log n/r)},$$

and the total work of the algorithm is

$$m \log n + 2^{O(r \log^2 r)} n \log^2 n \log \frac{m}{n \log n}.$$

Note that the first term dominates if $\frac{m}{n \log n} \ge 2^{O(r \log^2 r)}$, and hence we can simplify the bound to

$$m \log n + 2^{O(r \log^2 r)} n \log^2 n.$$

□


The following theorem follows from Lemma 9.7 by setting $r = \log \log \log n$.
Theorem 9.8. Given any SDD matrix $M$ with $n$ variables and $m$ non-zeros, we can find an
implicit block-Cholesky factorization for the matrix $M$ in $O(m \log n + n \log^{2+o(1)} n)$ work and
$O(n^{o(1)})$ depth such that for any vector $b$, we can compute an $\epsilon$-approximate solution to $M^{-1} b$
in $O((m + n \log^{1+o(1)} n) \log(1/\epsilon))$ work and $O(\log^2 n \log \log n \log(1/\epsilon))$ depth.

Acknowledgements 

We thank Michael Cohen for notifying us of several issues in previous versions of this manuscript. 


References 


[AZLO15] Zeyuan Allen-Zhu, Zhenyu Liao, and Lorenzo Orecchia. Spectral sparsification and
regret minimization beyond matrix multiplicative updates. In Proceedings of the
Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC '15,
pages 237-245, New York, NY, USA, 2015. ACM.

[BGH+06] M. Bern, J. Gilbert, B. Hendrickson, N. Nguyen, and S. Toledo. Support-graph
preconditioners. SIAM J. Matrix Anal. & Appl., 27(4):930-951, 2006.

[BHV08] Erik G. Boman, Bruce Hendrickson, and Stephen A. Vavasis. Solving elliptic
finite element systems in near-linear time with support preconditioners. SIAM J.
Numerical Analysis, 46(6):3264-3284, 2008.

[Bra77] Achi Brandt. Multi-level adaptive solutions to boundary-value problems.
Mathematics of Computation, 31(138):333-390, 1977.

[BSS12] Joshua Batson, Daniel A. Spielman, and Nikhil Srivastava. Twice-Ramanujan
sparsifiers. SIAM Journal on Computing, 41(6):1704-1721, 2012.

[CKM+11] Paul Christiano, Jonathan A. Kelner, Aleksander Madry, Daniel A. Spielman, and
Shang-Hua Teng. Electrical flows, Laplacian systems, and faster approximation of
maximum flow in undirected graphs. In Proceedings of the 43rd annual ACM
symposium on Theory of computing, STOC '11, pages 273-282, New York, NY, USA,
2011. ACM.

[CKM+14] Michael B. Cohen, Rasmus Kyng, Gary L. Miller, Jakub W. Pachocki, Richard Peng,
Anup B. Rao, and Shen Chen Xu. Solving SDD linear systems in nearly $m \log^{1/2} n$
time. In Proceedings of the 46th Annual ACM Symposium on Theory of Computing,
STOC '14, pages 343-352, New York, NY, USA, 2014. ACM.

[CLM+14] Michael B. Cohen, Yin Tat Lee, Cameron Musco, Christopher Musco, Richard Peng,
and Aaron Sidford. Uniform sampling for matrix approximation. arXiv preprint
arXiv:1408.5099, 2014.

[DS08] Samuel I. Daitch and Daniel A. Spielman. Faster approximate lossy generalized flow
via interior point algorithms. In Proceedings of the 40th annual ACM symposium on
Theory of computing, pages 451-460. ACM, 2008.


[Fed64] Radii Petrovich Fedorenko. The speed of convergence of one iterative process. USSR
Computational Mathematics and Mathematical Physics, 4(3):227-235, 1964.

[Hac82] Wolfgang Hackbusch. Multi-grid convergence theory. Springer, 1982.

[Hac85] Wolfgang Hackbusch. Multi-grid methods and applications, volume 4.
Springer-Verlag Berlin, 1985.

[HZ96] Shay Halperin and Uri Zwick. An optimal randomised logarithmic time connectivity
algorithm for the EREW PRAM. Journal of Computer and System Sciences,
53(3):395-416, 1996.

[KFS13] Dilip Krishnan, Raanan Fattal, and Richard Szeliski. Efficient preconditioning of
Laplacian matrices for computer graphics. ACM Transactions on Graphics (TOG),
32(4):142, 2013.

[KMP12] Jonathan A. Kelner, Gary L. Miller, and Richard Peng. Faster approximate
multicommodity flow using quadratically coupled flows. In Proceedings of the 44th
symposium on Theory of Computing, STOC '12, pages 1-18, New York, NY, USA, 2012.
ACM.

[KOSZ13] Jonathan A. Kelner, Lorenzo Orecchia, Aaron Sidford, and Zeyuan Allen Zhu. A
simple, combinatorial algorithm for solving SDD systems in nearly-linear time. In
Proceedings of the 45th annual ACM symposium on Symposium on theory of
computing, pages 911-920. ACM, 2013.

[Kou14] Ioannis Koutis. Simple parallel and distributed algorithms for spectral graph
sparsification. In Proceedings of the 26th ACM Symposium on Parallelism in Algorithms
and Architectures, SPAA '14, pages 61-66, New York, NY, USA, 2014. ACM.

[LM10] Frank Thomson Leighton and Ankur Moitra. Extensions and limits to vertex
sparsification. In Proceedings of the 42nd ACM Symposium on Theory of Computing,
STOC 2010, Cambridge, Massachusetts, USA, 5-8 June 2010, pages 47-56, 2010.

[LPS88] A. Lubotzky, R. Phillips, and P. Sarnak. Ramanujan graphs. Combinatorica,
8(3):261-277, 1988.

[LRS13] Yin Tat Lee, Satish Rao, and Nikhil Srivastava. A new approach to computing
maximum flows using electrical flows. In Proceedings of the 45th annual ACM symposium
on Symposium on theory of computing, STOC '13, pages 755-764, New York, NY,
USA, 2013. ACM.

[LS13] Yin Tat Lee and Aaron Sidford. Path finding II: An $\tilde{O}(m\sqrt{n})$ algorithm for
the minimum cost flow problem. arXiv preprint arXiv:1312.6713, 2013.

[Mad13] Aleksander Madry. Navigating central path with electrical flows: From flows to
matchings, and back. In 54th Annual IEEE Symposium on Foundations of Computer
Science, FOCS 2013, 26-29 October, 2013, Berkeley, CA, USA, pages 253-262, 2013.


[Mar88] G. A. Margulis. Explicit group theoretical constructions of combinatorial schemes
and their application to the design of expanders and concentrators. Problems of
Information Transmission, 24(1):39-46, July 1988.

[Moi13] Ankur Moitra. Vertex sparsification and oblivious reductions. SIAM J. Comput.,
42(6):2400-2423, 2013.

[MP13] Gary L. Miller and Richard Peng. Approximate maximum flow on separable
undirected graphs. In Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium
on Discrete Algorithms, pages 1151-1170. SIAM, 2013.

[MSS15] Adam W. Marcus, Nikhil Srivastava, and Daniel A. Spielman. Interlacing families IV:
Bipartite Ramanujan graphs of all sizes. arXiv preprint arXiv:1505.08010, 2015. To
appear in FOCS 2015.

[MV77] J. A. Meijerink and H. A. van der Vorst. An iterative solution method for linear
systems of which the coefficient matrix is a symmetric M-matrix. Mathematics of
Computation, 31(137):148-162, 1977.

[Nic78] R. A. Nicolaides. On multigrid convergence in the indefinite case. Mathematics of
Computation, pages 1082-1086, 1978.

[NN12] Artem Napov and Yvan Notay. An algebraic multigrid method with guaranteed
convergence rate. SIAM Journal on Scientific Computing, 34(2):A1079-A1109, 2012.

[OV11] Lorenzo Orecchia and Nisheeth K. Vishnoi. Towards an SDP-based approach to
spectral methods: a nearly-linear-time algorithm for graph partitioning and
decomposition. In Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on
Discrete Algorithms, SODA '11, pages 532-545. SIAM, 2011.

[PS14] Richard Peng and Daniel A. Spielman. An efficient parallel solver for SDD linear
systems. In Symposium on Theory of Computing, STOC 2014, New York, NY, USA,
May 31 - June 03, 2014, pages 333-342, 2014.

[SS11] D. Spielman and N. Srivastava. Graph sparsification by effective resistances. SIAM
Journal on Computing, 40(6):1913-1926, 2011.

[ST11] D. Spielman and S. Teng. Spectral sparsification of graphs. SIAM Journal on
Computing, 40(4):981-1025, 2011.

[ST14] Daniel A. Spielman and Shang-Hua Teng. Nearly-linear time algorithms for
preconditioning and solving symmetric, diagonally dominant linear systems. SIAM J.
Matrix Anal. & Appl., 35:835-885, 2014.

[Tch36] Nikolai Tchudakoff. On the difference between two neighbouring prime numbers.
Rec. Math. [Mat. Sbornik] N.S., 1(6):799-814, 1936.

[Vai90] Pravin M. Vaidya. Solving linear equations with symmetric diagonally dominant
matrices by constructing good preconditioners. Unpublished manuscript, UIUC, 1990.
A talk based on the manuscript was presented at the IMA Workshop on Graph
Theory and Sparse Matrix Computation, October 1991, Minneapolis.


[ZGL03] X. Zhu, Z. Ghahramani, and J. D. Lafferty. Semi-supervised learning using gaussian 
fields and harmonic functions. ICML, 2003. 

A Weighted Expander Constructions 

In this section, we give a linear time algorithm for computing linear sized spectral sparsifiers
of complete and bipartite product demand graphs. Recall that the product demand graph with
vertex set $V$ and demands $d : V \to \mathbb{R}_{>0}$ is the complete graph in which the weight of edge $(u, v)$
is the product $d_u d_v$. Similarly, the bipartite demand graph with vertex set $U \cup V$ and demands
$d : U \cup V \to \mathbb{R}_{>0}$ is the complete bipartite graph in which the weight of the edge $(u, v)$ is
the product $d_u d_v$. Our routines are based on reductions to the unweighted, uniform case. In
particular, we

1. Split all of the high demand vertices into many vertices that all have the same demand.
   This demand will still be the highest.

2. Given a graph in which almost all of the vertices have the same highest demand, we

   a. drop all of the edges between vertices of lower demand,

   b. replace the complete graph between the vertices of highest demand with an expander,
      and

   c. replace the bipartite graph between the high and low demand vertices with a union
      of stars.

3. To finish, we merge back together the vertices that were split off from each original vertex.

We start by showing how to construct the expanders that we need for step (2b). We state
formally and analyze the rest of the algorithm for the complete case in the following two sections.
We explain how to handle the bipartite case in Section A.3.
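
As a small illustration of the definition above, the following Python sketch builds the edge weights of a complete product demand graph and evaluates its Laplacian quadratic form $x^T L x = \sum_{u<v} d_u d_v (x_u - x_v)^2$ in two ways. The function names and the sample vectors are hypothetical and chosen only for this example. The closed form $(\sum_u d_u)(\sum_u d_u x_u^2) - (\sum_u d_u x_u)^2$ is a standard expansion of the same sum and is not taken from the text.

```python
from itertools import combinations


def product_demand_edges(d):
    """Edge weights {(u, v): d[u] * d[v]} for all u < v (complete product demand graph)."""
    return {(u, v): d[u] * d[v] for u, v in combinations(range(len(d)), 2)}


def quadratic_form_from_edges(edges, x):
    """x^T L x computed edge by edge: sum of w_uv * (x_u - x_v)^2."""
    return sum(w * (x[u] - x[v]) ** 2 for (u, v), w in edges.items())


def quadratic_form_closed(d, x):
    """Same value in O(n) arithmetic: (sum d)(sum d_u x_u^2) - (sum d_u x_u)^2."""
    total = sum(d)
    second_moment = sum(du * xu * xu for du, xu in zip(d, x))
    first_moment = sum(du * xu for du, xu in zip(d, x))
    return total * second_moment - first_moment ** 2


# Hypothetical demands and test vector.
d = [3, 1, 2, 5]
x = [1, -2, 0, 4]
edges = product_demand_edges(d)
```

The two evaluations agree on this input, even though the first touches all $\binom{n}{2}$ edges and the second only the $n$ demands.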

Expanders give good approximations to unweighted complete graphs, and our constructions
will use the spectrally best expanders: Ramanujan graphs. These are defined in terms of the
eigenvalues of their adjacency matrices. We recall that the adjacency matrix of every $d$-regular
graph has eigenvalue $d$ with multiplicity 1 corresponding to the constant eigenvector. If the
graph is bipartite, then it also has an eigenvalue of $-d$ corresponding to an eigenvector that
takes value $1$ on one side of the bipartition and $-1$ on the other side. These are called the trivial
eigenvalues. A $d$-regular graph is called a Ramanujan graph if all of its non-trivial eigenvalues
have absolute value at most $2\sqrt{d-1}$. Ramanujan graphs were constructed independently by
Margulis [Mar88] and Lubotzky, Phillips, and Sarnak [LPS88]. The following theorem and
proposition summarize part of their results.
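
To make the definition concrete before the formal statements, here is a minimal Python check on one small example that is not used in the paper: the Petersen graph, which is 3-regular on 10 vertices. The sketch verifies the identity $A^2 + A - 2I = J$ for its adjacency matrix, where $J$ is the all-ones matrix. On vectors orthogonal to the constant vector this forces every eigenvalue $\theta$ to satisfy $\theta^2 + \theta - 2 = 0$, so the non-trivial eigenvalues are $1$ and $-2$. All names below are illustrative.

```python
from itertools import combinations
from math import sqrt

# Petersen graph: vertices are the 2-element subsets of {0,...,4};
# two vertices are adjacent exactly when the subsets are disjoint.
verts = [frozenset(c) for c in combinations(range(5), 2)]
n = len(verts)
A = [[1 if not (u & v) else 0 for v in verts] for u in verts]


def matmul(X, Y):
    """Plain-Python product of two square integer matrices."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


A2 = matmul(A, A)

# Check A^2 + A - 2I = J entry by entry.
identity_holds = all(
    A2[i][j] + A[i][j] - (2 if i == j else 0) == 1
    for i in range(n) for j in range(n)
)

d = 3
nontrivial = [1, -2]  # roots of t^2 + t - 2 = 0
is_ramanujan = identity_holds and all(abs(t) <= 2 * sqrt(d - 1) for t in nontrivial)
```

Since $2\sqrt{d-1} = 2\sqrt{2} \approx 2.83$, both non-trivial eigenvalues meet the Ramanujan bound in this example.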

Theorem A.1. Let $p$ and $q$ be unequal primes congruent to 1 modulo 4. If $p$ is a quadratic
residue modulo $q$, then there is a non-bipartite Ramanujan graph of degree $p + 1$ with $q^2(q - 1)/2$
vertices. If $p$ is not a quadratic residue modulo $q$, then there is a bipartite Ramanujan graph of
degree $p + 1$ with $q^2(q - 1)$ vertices.

The construction is explicit.

Proposition A.2. If $p < q$, then the graph guaranteed to exist by Theorem A.1 can be
constructed in parallel depth $O(\log n)$ and work $O(n)$, where $n$ is its number of vertices.

Sketch of proof. When $p$ is a quadratic residue modulo $q$, the graph is a Cayley graph of
$PSL(2, \mathbb{Z}/q\mathbb{Z})$. In the other case, it is a Cayley graph of $PGL(2, \mathbb{Z}/q\mathbb{Z})$. In both cases, the
generators are determined by the $p + 1$ solutions to the equation $p = a_0^2 + a_1^2 + a_2^2 + a_3^2$ where
$a_0 > 0$ is odd and $a_1$, $a_2$, and $a_3$ are even. Clearly, all of the numbers $a_0$, $a_1$, $a_2$ and $a_3$ must be
at most $\sqrt{p}$ in absolute value. So, we can compute a list of all sums $a_0^2 + a_1^2$ and all of the sums
$a_2^2 + a_3^2$ with work $O(p)$, and thus a list of all $p + 1$ solutions with work $O(p^2) \le O(n)$.

As the construction requires arithmetic modulo $q$, it is convenient to compute the entire
multiplication table modulo $q$. This takes time $O(q^2) \le O(n)$. The construction also requires the
computation of a square root of $-1$ modulo $q$, which may be computed from the multiplication
table. Given this data, the list of edges attached to each vertex of the graph may be produced
using linear work and logarithmic depth. □
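
The enumeration step in the proof sketch can be illustrated as follows. This is a minimal sequential sketch, not the parallel routine of Proposition A.2: it lists the solutions of $p = a_0^2 + a_1^2 + a_2^2 + a_3^2$ with $a_0 > 0$ odd and $a_1, a_2, a_3$ even by tabulating the pair sums $a_2^2 + a_3^2$ once. The function name is hypothetical.

```python
from math import isqrt


def lps_generator_solutions(p):
    """All (a0, a1, a2, a3) with a0^2 + a1^2 + a2^2 + a3^2 = p,
    a0 > 0 odd and a1, a2, a3 even (possibly negative or zero)."""
    r = isqrt(p)
    evens = [a for a in range(-r, r + 1) if a % 2 == 0]

    # Tabulate the pair sums a2^2 + a3^2 once, as in the proof sketch.
    pair_sums = {}
    for a2 in evens:
        for a3 in evens:
            pair_sums.setdefault(a2 * a2 + a3 * a3, []).append((a2, a3))

    solutions = []
    for a0 in range(1, r + 1, 2):
        for a1 in evens:
            rest = p - a0 * a0 - a1 * a1
            for a2, a3 in pair_sums.get(rest, []):
                solutions.append((a0, a1, a2, a3))
    return solutions


# Small primes congruent to 1 modulo 4, used only as a sanity check.
counts = {p: len(lps_generator_solutions(p)) for p in (5, 13, 17)}
```

For the three primes tried here the number of solutions is $p + 1$, matching the count stated in the proof sketch.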

For our purposes, there are three obstacles to using these graphs: 

1. They do not come in every degree. 

2. They do not come in every number of vertices. 

3. Some are bipartite and some are not. 

We handle the first two issues by observing that the primes congruent to 1 modulo 4 are
sufficiently dense. To address the third issue, we give a procedure to convert a non-bipartite
expander into a bipartite expander, and vice versa.

An upper bound on the gaps between consecutive primes congruent to 1 modulo 4 can be 
obtained from the following theorem of Tchudakoff. 

Theorem A.3 ([Tch36]). For two integers $a$ and $b$, let $p_i$ be the $i$th prime congruent to $a$ modulo
$b$. For every $\epsilon > 0$,

\[ p_{i+1} - p_i \le O\left(p_i^{3/4+\epsilon}\right). \]

Corollary A.4. There exists an $n_0$ so that for all $n \ge n_0$ there is a prime congruent to 1
modulo 4 between $n$ and $2n$.
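
Corollary A.4 is an asymptotic statement and the text does not give a value for $n_0$. The brute-force Python sketch below, with hypothetical function names, simply searches the interval $[n, 2n]$ by trial division, which is enough to experiment with small inputs. It is not the method used in the paper.

```python
def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True


def prime_1_mod_4_between(n):
    """Smallest prime q with q % 4 == 1 and n <= q <= 2n, or None if there is none."""
    for q in range(n, 2 * n + 1):
        if q % 4 == 1 and is_prime(q):
            return q
    return None


examples = {n: prime_1_mod_4_between(n) for n in (6, 10, 20, 100)}
```

In this small experiment the interval for $n = 6$ contains no such prime, while every $n$ from 7 up to the tested range does have one.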

We now explain how we convert between bipartite and non-bipartite expander graphs. To
convert a non-bipartite expander into a bipartite expander, we take its double-cover. We recall
that if $G = (V, E)$ is a graph with adjacency matrix $A$, then its double-cover is the graph with
adjacency matrix

\[ \begin{pmatrix} 0 & A \\ A & 0 \end{pmatrix}. \]

It is immediate from this construction that the eigenvalues of the adjacency matrix of the
double-cover are the union of the eigenvalues of $A$ with the eigenvalues of $-A$.
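
The eigenvalue claim can be checked by hand on a small case. The sketch below, using illustrative names and a triangle as the example graph, builds the block matrix with zero diagonal blocks and $A$ in both off-diagonal blocks, then confirms that an eigenvector $x$ of $A$ with eigenvalue $-1$ lifts to $[x; x]$ with eigenvalue $-1$ and to $[x; -x]$ with eigenvalue $+1$.

```python
def double_cover(A):
    """Adjacency matrix [[0, A], [A, 0]] of the double cover of a graph with adjacency A."""
    n = len(A)
    top = [[0] * n + list(A[i]) for i in range(n)]
    bottom = [list(A[i]) + [0] * n for i in range(n)]
    return top + bottom


def matvec(M, x):
    """Matrix-vector product for list-of-lists matrices."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


A = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]  # triangle, eigenvalues 2, -1, -1
x = [1, -1, 0]                         # eigenvector of A with eigenvalue -1
B = double_cover(A)

same_sign = x + x                      # [x; x]
opposite_sign = x + [-v for v in x]    # [x; -x]
```

Applying `matvec(B, ...)` to the two lifted vectors returns the first one negated and the second one unchanged.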

Proposition A.5. Let $G$ be a connected, $d$-regular graph in which all adjacency matrix
eigenvalues other than $d$ are bounded in absolute value by $\lambda$. Then, all non-trivial adjacency
matrix eigenvalues of the double-cover of $G$ are also bounded in absolute value by $\lambda$.

To convert a bipartite expander into a non-bipartite expander, we will simply collapse the
two vertex sets onto one another. If $G = (U \cup V, E)$ is a bipartite graph, we specify how the
vertices of $V$ are mapped onto $U$ by a permutation $\pi : V \to U$. We then define the collapse of
$G$ induced by $\pi$ to be the graph with vertex set $U$ and edge set

\[ \{ (u, \pi(v)) : (u, v) \in E \}. \]

Note that the collapse will have self-loops at vertices $u$ for which $(u, v) \in E$ and $u = \pi(v)$. We
assign a weight of 2 to every self-loop. When a double-edge would be created, that is, when
$(\pi(v), \pi^{-1}(u))$ is also an edge in the graph, we give the edge a weight of 2. Thus, the collapse
can be a weighted graph.

Proposition A.6. Let $G$ be a $d$-regular bipartite graph with all non-trivial adjacency matrix
eigenvalues bounded by $\lambda$, and let $H$ be a collapse of $G$. Then, every vertex in $H$ has weighted
degree $2d$ and all adjacency matrix eigenvalues of $H$ other than $2d$ are bounded in absolute value
by $2\lambda$.

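Proof. To prove the bound on the eigenvalues, let $G$ have adjacency matrix

\[ \begin{pmatrix} 0 & A \\ A^T & 0 \end{pmatrix}. \]

After possibly rearranging rows and columns, we may assume that the adjacency matrix of the
collapse is given by

\[ A + A^T. \]

Note that the self-loops, if they exist, correspond to diagonal entries of value 2. Now, let $x$ be
a unit vector orthogonal to the all-1s vector. We have

\[ \left| x^T (A + A^T) x \right|
   = \left| \begin{pmatrix} x \\ x \end{pmatrix}^T
     \begin{pmatrix} 0 & A \\ A^T & 0 \end{pmatrix}
     \begin{pmatrix} x \\ x \end{pmatrix} \right|
   \le \lambda \left\| \begin{pmatrix} x \\ x \end{pmatrix} \right\|^2
   = 2\lambda, \]

as the vector $[x; x]$ is orthogonal to the eigenvectors of the trivial eigenvalues of the adjacency
matrix of $G$. □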
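
The collapse operation can be sketched in a few lines of Python. The representation below is an assumption made for the example: both sides of the bipartite graph are labelled $0, \dots, n-1$, edges are pairs $(u, v)$ with $u \in U$ and $v \in V$, and `pi[v]` gives the vertex of $U$ that $v$ is mapped onto. The resulting matrix equals $A + A^T$ after relabelling, so self-loops receive weight 2 and doubled edges receive weight 2.

```python
def collapse(n, edges, pi):
    """Weighted adjacency matrix of the collapse of a bipartite graph on U = V = {0..n-1}."""
    W = [[0] * n for _ in range(n)]
    for u, v in edges:
        w = pi[v]
        if u == w:
            W[u][u] += 2      # self-loop, assigned weight 2
        else:
            W[u][w] += 1      # an edge hit twice ends up with weight 2
            W[w][u] += 1
    return W


# A 6-cycle viewed as a 2-regular bipartite graph: u_i ~ v_i and u_i ~ v_{i+1}.
cycle_edges = [(i, i) for i in range(3)] + [(i, (i + 1) % 3) for i in range(3)]

with_loops = collapse(3, cycle_edges, pi=[0, 1, 2])    # identity map: creates self-loops
with_doubles = collapse(3, cycle_edges, pi=[1, 2, 0])  # shifted map: creates double edges
```

In both cases every row of the result sums to $4 = 2d$, in line with the degree claim of Proposition A.6.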


We now state how bounds on the eigenvalues of the adjacency matrices of graphs lead to 
approximations of complete graphs and complete bipartite graphs. 

Proposition A.7. Let $G$ be a graph with $n$ vertices, possibly with self-loops and weighted edges,
such that every vertex of $G$ has weighted degree $d$ and such that all non-trivial eigenvalues of
the adjacency matrix of $G$ have absolute value at most $\lambda \le d/2$. If $G$ is not bipartite, then
$(n/d)L_G$ is an $\epsilon$-approximation of $K_n$ for $\epsilon = (2 \ln 2)\lambda/d$. If $G$ is bipartite, then $(n/d)L_G$ is
an $\epsilon$-approximation of $K_{n,n}$ for $\epsilon = (2 \ln 2)\lambda/d$.

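Proof. Let $A$ be the adjacency matrix of $G$. Then,

\[ L_G = dI - A. \]

In the non-bipartite case, we observe that all of the non-zero eigenvalues of $L_{K_n}$ are $n$, so
for all vectors $x$ orthogonal to the constant vector,

\[ x^T L_{K_n} x = n x^T x. \]

As all of the non-zero eigenvalues of $L_G$ are between $d - \lambda$ and $d + \lambda$, for all vectors $x$ orthogonal
to the constant vector

\[ n \left(1 - \frac{\lambda}{d}\right) x^T x \le x^T (n/d) L_G x \le n \left(1 + \frac{\lambda}{d}\right) x^T x. \]

Thus,

\[ \left(1 - \frac{\lambda}{d}\right) L_{K_n} \preceq (n/d) L_G \preceq \left(1 + \frac{\lambda}{d}\right) L_{K_n}. \]

In the bipartite case, we naturally assume that the bipartition is the same in both $G$ and
$K_{n,n}$. Now, let $x$ be any vector on the vertex set of $G$. Both the graphs $K_{n,n}$ and $(n/d)G$ have
Laplacian matrix eigenvalue $0$ with the constant eigenvector, and eigenvalue $2n$ with eigenvector
$[1; -1]$. The other eigenvalues of the Laplacian of $K_{n,n}$ are $n$, while the other eigenvalues of the
Laplacian of $(n/d)G$ are between

\[ n \left(1 - \frac{\lambda}{d}\right) \quad \text{and} \quad n \left(1 + \frac{\lambda}{d}\right). \]

Thus,

\[ \left(1 - \frac{\lambda}{d}\right) L_{K_{n,n}} \preceq (n/d) L_G \preceq \left(1 + \frac{\lambda}{d}\right) L_{K_{n,n}}. \]

The proposition now follows from our choice of $\epsilon$, which guarantees that

\[ e^{-\epsilon} \le 1 - \lambda/d \quad \text{and} \quad 1 + \lambda/d \le e^{\epsilon}, \]

provided that $\lambda/d \le 1/2$. □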
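
The final step of the proof rests on two scalar inequalities. The Python sketch below checks them numerically on a grid of ratios $\lambda/d \in [0, 1/2]$ for the choice $\epsilon = (2 \ln 2)\lambda/d$. The function names and the small floating-point tolerance are assumptions of this example, and a grid check is of course not a proof.

```python
from math import exp, log


def eps_for(ratio):
    """epsilon = (2 ln 2) * (lambda / d), the choice made in Proposition A.7."""
    return 2 * log(2) * ratio


def sandwich_holds(ratio, tol=1e-12):
    """Check e^{-eps} <= 1 - ratio and 1 + ratio <= e^{eps}, up to rounding tolerance."""
    eps = eps_for(ratio)
    return exp(-eps) <= 1 - ratio + tol and 1 + ratio <= exp(eps) + tol


ratios = [i / 1000 for i in range(0, 501)]  # lambda/d from 0 to 1/2
all_ok = all(sandwich_holds(r) for r in ratios)

# Beyond 1/2 the lower inequality breaks, e.g. at lambda/d = 0.6.
fails_beyond = not sandwich_holds(0.6)
```

The lower inequality is tight at $\lambda/d = 1/2$, where $e^{-\epsilon} = 1/2$, which is why the hypothesis $\lambda \le d/2$ appears in the statement.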

Lemma A.8. There are algorithms that on input $n$ and $\epsilon > n^{-1/6}$ produce a graph having
$O(n/\epsilon^2)$ edges that is an $O(\epsilon)$ approximation of $K_{n'}$ or $K_{n',n'}$ for some $n \le n' \le 8n$. These
algorithms run in $O(\log n)$ depth and $O(n/\epsilon^2)$ work.

Proof. We first consider the problem of constructing an approximation of $K_{n',n'}$. By
Corollary A.4 there is a constant $n_0$ so that if $n > n_0$, then there is a prime $q$ that is equivalent to 1
modulo 4 so that $q^2(q - 1)$ is between $n$ and $8n$. Let $q$ be such a prime and let $n' = q^2(q - 1)$.
Similarly, for $\epsilon$ sufficiently small, there is a prime $p$ equivalent to 1 modulo 4 that is between
$\epsilon^{-2}/2$ and $\epsilon^{-2}$. Our algorithm should construct the corresponding Ramanujan graph, as
described in Theorem A.1 and Proposition A.2. If the graph is bipartite, then Proposition A.7
tells us that it provides the desired approximation of $K_{n',n'}$. If the graph is not bipartite, then we
form its double cover to obtain a bipartite graph and use Proposition A.5 and Proposition A.7
to see that it provides the desired approximation of $K_{n',n'}$.

The non-bipartite case is similar, except that we require a prime $q$ so that $q^2(q - 1)/2$ is
between $n$ and $8n$, and we use a collapse to convert a bipartite expander to a non-bipartite one,
as analyzed in Proposition A.6. □

In Section 3 we just need to know that there exist graphs of low degree that are good
approximations of complete graphs. We may obtain them from the recent theorem of Marcus,
Spielman and Srivastava that there exist bipartite Ramanujan graphs of every degree and number
of vertices [MSS15].

Lemma A.9. For every integer $n$ and even integer $d$, there is a weighted graph on $n$ vertices
of degree at most $d$ that is a $4/\sqrt{d}$ approximation of $K_n$.

Proof. The main theorem of [MSS15] tells us that there is a bipartite Ramanujan graph on $2n$
vertices of degree $k$ for every $k \le n$. By Propositions A.6 and A.7, a collapse of this graph is
a weighted graph of degree at most $2k$ that is a $(4 \ln 2)/\sqrt{k}$ approximation of $K_n$. The result
now follows by setting $d = 2k$. □

A.l Sparsifying Complete Product Demand Graphs 

Our algorithm for sparsifying complete product demand graphs begins by splitting the vertices 
of highest demands into many vertices. By splitting a vertex, we mean replacing it by many 
vertices whose demands sum to its original demand. In this way, we obtain a larger product 
demand graph. We observe that we can obtain a sparsifier of the original graph by sparsifying 
the larger graph, and then collapsing back together the vertices that were split. 

Proposition A.10. Let $G$ be a product demand graph with vertex set $\{1, \dots, n\}$ and demands
$d$, and let $\widehat{G} = (\widehat{V}, \widehat{E})$ be a product demand graph with demands $\hat{d}$. If there is a partition of $\widehat{V}$
into sets $S_1, \dots, S_n$ so that for all $i \in V$, $\sum_{j \in S_i} \hat{d}_j = d_i$, then $\widehat{G}$ is a splitting of $G$ and there is
a matrix $M$ so that

\[ L_G = M L_{\widehat{G}} M^T. \]

Proof. The $(i, j)$ entry of matrix $M$ is 1 if and only if $j \in S_i$. Otherwise, it is zero. □
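
A tiny worked instance of Proposition A.10 may help. In the Python sketch below, the demands $d = (4, 1)$ are split into $\hat{d} = (2, 2, 1)$ with $S_1 = \{0, 1\}$ and $S_2 = \{2\}$ (zero-based indices). These numbers and all helper names are made up for the illustration. The code forms $M$ as in the proof and compares $L_G$ with $M L_{\widehat{G}} M^T$.

```python
def product_demand_laplacian(d):
    """Laplacian of the complete product demand graph with demands d."""
    n, total = len(d), sum(d)
    return [[d[i] * (total - d[i]) if i == j else -d[i] * d[j] for j in range(n)]
            for i in range(n)]


def matmul(X, Y):
    """Plain-Python matrix product for list-of-lists matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


d = [4, 1]                 # original demands
d_hat = [2, 2, 1]          # demands after splitting vertex 0 in two
parts = [[0, 1], [2]]      # S_1 and S_2

# M[i][j] = 1 exactly when split vertex j belongs to S_i.
M = [[1 if j in S else 0 for j in range(len(d_hat))] for S in parts]

lhs = product_demand_laplacian(d)
rhs = matmul(matmul(M, product_demand_laplacian(d_hat)), transpose(M))
```

The edge of weight 4 between the two copies of vertex 0 becomes a self-loop after collapsing and therefore contributes nothing to the Laplacian on the right-hand side.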

We now show that we can sparsify $G$ by sparsifying $\widehat{G}$.

Proposition A.11. Let $\widehat{G}_1$ and $\widehat{G}_2$ be graphs on the same vertex set $\widehat{V}$ such that $\widehat{G}_1 \approx_{\epsilon} \widehat{G}_2$
for some $\epsilon$. Let $S_1, \dots, S_n$ be a partition of $\widehat{V}$, and let $G_1$ and $G_2$ be the graphs obtained by
collapsing together all the vertices in each set $S_i$ and eliminating any self-loops that are created.
Then

\[ G_1 \approx_{\epsilon} G_2. \]

Proof. Let $M$ be the matrix introduced in Proposition A.10. Then,

\[ L_{G_1} = M L_{\widehat{G}_1} M^T \quad \text{and} \quad L_{G_2} = M L_{\widehat{G}_2} M^T. \]

The proof now follows from Fact 3.2. □

For distinct vertices $i$ and $j$, we let $(i, j)$ denote the graph with an edge of weight 1 between
vertex $i$ and vertex $j$. If $i = j$, we let $(i, j)$ be the empty graph. With this notation, we can
express the product demand graph as

\[ \sum_{i < j} d_i d_j (i, j) = \frac{1}{2} \sum_{i, j \in V} d_i d_j (i, j). \]

This notation also allows us to precisely express our algorithm for sparsifying product demand
graphs.

$G'$ = WeightedExpander($d$, $\epsilon$)

1. Let $\hat{n}$ be the least integer greater than $2n/\epsilon^2$ such that the algorithm described in
   Lemma A.8 produces an $\epsilon$-approximation of $K_{\hat{n}}$.

2. Let $t = \frac{\sum_k d_k}{\hat{n}}$.

3. Create a new product demand graph $\widehat{G}$ with demand vector $\hat{d}$ by splitting each vertex $i$
   into a set $S_i$ of $\lceil d_i/t \rceil$ vertices:

   (a) $\lfloor d_i/t \rfloor$ vertices with demand $t$.

   (b) one vertex with demand $d_i - t \lfloor d_i/t \rfloor$.

4. Let $H$ be a set of $\hat{n}$ vertices in $\widehat{G}$ with demand $t$, and let $L$ contain the other vertices. Set
   $k = |L|$.

5. Partition $H$ arbitrarily into sets $V_1, \dots, V_k$, so that $|V_i| \ge \lfloor \hat{n}/k \rfloor$ for all $1 \le i \le k$.

6. Use the algorithm described in Lemma A.8 to produce $\widetilde{K}_{HH}$, an $\epsilon$-approximation of the
   complete graph on $H$. Set

   \[ \widetilde{G} = t^2 \widetilde{K}_{HH} + \sum_{l \in L} \frac{|H|}{|V_l|} \sum_{h \in V_l} \hat{d}_l \hat{d}_h (l, h). \]

7. Let $G'$ be the graph obtained by collapsing together all vertices in each set $S_i$.
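
Steps 2 and 3 of the algorithm can be sketched as follows. This is only an illustration of the splitting rule under stated assumptions: demands are integers, exact rational arithmetic from Python's `fractions` module is used to avoid rounding, the value `n_hat` is supplied by the caller rather than computed from Lemma A.8, and a remainder vertex is created only when its demand is positive. The function name is hypothetical.

```python
from fractions import Fraction
from math import floor


def split_demands(d, n_hat):
    """Split each demand d_i into floor(d_i / t) copies of t plus one remainder vertex,
    where t = sum(d) / n_hat. Returns (t, d_hat, parts) with parts[i] listing the
    indices in d_hat of the vertices that vertex i was split into."""
    t = Fraction(sum(d), n_hat)
    d_hat, parts = [], []
    for di in d:
        full = floor(Fraction(di) / t)    # number of copies with demand t
        rest = Fraction(di) - full * t    # leftover demand, in [0, t)
        piece = [t] * full + ([rest] if rest > 0 else [])
        parts.append(list(range(len(d_hat), len(d_hat) + len(piece))))
        d_hat.extend(piece)
    return t, d_hat, parts


# Hypothetical input: three vertices and n_hat = 4.
d = [5, 3, 2]
n_hat = 4
t, d_hat, parts = split_demands(d, n_hat)
```

Here $t = 5/2$, the split graph has 5 vertices (at most $n + \hat{n} = 7$), and the demands inside each $S_i$ sum back to $d_i$.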


This section and the next are devoted to the analysis of this algorithm. Given
Proposition A.11, we just need to show that $\widetilde{G}$ is a good approximation to $\widehat{G}$.

Proposition A.12. The number of vertices in $\widehat{G}$ is at most $n + \hat{n}$.

Proof. The number of vertices in $\widehat{G}$ is

\[ \sum_{i \in V} \lceil d_i/t \rceil \le n + \sum_{i \in V} d_i/t = n + \hat{n}. \]

□

So, $k \le n$ and $\hat{n} \ge 2k/\epsilon^2$. That is, $|H| \ge 2|L|/\epsilon^2$. In the next section, we prove the lemmas
that show that for these special product demand graphs $\widehat{G}$ in which almost all weights are the
maximum, our algorithm produces a graph $\widetilde{G}$ that is a good approximation of $\widehat{G}$.

Theorem A.13. Let $0 < \epsilon < 1$ and let $G$ be a product demand graph with $n$ vertices and
demand vector $d$. Given $d$ and $\epsilon$ as input, WeightedExpander produces a graph $G'$ with
$O(n/\epsilon^4)$ edges that is an $O(\epsilon)$ approximation of $G$. Moreover, WeightedExpander runs in
$O(\log n)$ depth and $O(n/\epsilon^4)$ work.

Proof. The number of vertices in the graph $\widehat{G}$ will be between $n + 2n/\epsilon^2$ and $n + 16n/\epsilon^2$. So,
the algorithm described in Lemma A.8 will take $O(\log n)$ depth and $O(n/\epsilon^4)$ work to produce
an $\epsilon$-approximation of the complete graph on $\hat{n}$ vertices. This dominates the computational cost
of the algorithm.

Proposition A.11 tells us that $G'$ approximates $G$ at least as well as $\widetilde{G}$ approximates $\widehat{G}$. To
bound how well $\widetilde{G}$ approximates $\widehat{G}$, we use two lemmas that are stated in the next section.
Lemma A.15 shows that

\[ \widehat{G}_{HH} + \widehat{G}_{LH} \approx_{O(\epsilon^2)} \widehat{G}. \]

Lemma A.17 shows that

\[ \widehat{G}_{HH} + \widehat{G}_{LH} \approx_{4\epsilon} \widehat{G}_{HH} + \sum_{l \in L} \frac{|H|}{|V_l|} \sum_{h \in V_l} \hat{d}_l \hat{d}_h (l, h). \]

And, we already know that $t^2 \widetilde{K}_{HH}$ is an $\epsilon$-approximation of $\widehat{G}_{HH}$. Fact 3.3 says that we can
combine these three approximations to conclude that $\widetilde{G}$ is an $O(\epsilon)$-approximation of $\widehat{G}$. □


A.2 Product demand graphs with most weights maximal 

In this section, we consider product demand graphs in which almost all weights are the maximum.
For simplicity, we make a slight change of notation from the previous section. We drop the hats,
we let $n$ be the number of vertices in the product demand graph, and we order the demands so
that

\[ d_1 \le \cdots \le d_k \le d_{k+1} = \cdots = d_n = 1. \]

We let $L = \{1, \dots, k\}$ and $H = \{k + 1, \dots, n\}$ be the sets of low and high demand vertices,
respectively. Let $G$ be the product demand graph corresponding to $d$, and let $G_{LL}$, $G_{HH}$ and
$G_{LH}$ be the subgraphs containing the low-low, high-high and low-high edges respectively. We
now show that little is lost by dropping the edges in $G_{LL}$ when $k$ is small.

Our analysis will make frequent use of the following Poincaré inequality:

Lemma A.14. Let $c(u, v)$ be an edge of weight $c$ and let $P$ be a path from $u$ to $v$ consisting
of edges of weights $c_1, c_2, \dots, c_k$. Then

\[ c(u, v) \preceq c \left( \sum_{i=1}^{k} c_i^{-1} \right) P. \]
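
In terms of quadratic forms, Lemma A.14 says that for every assignment of values $x_0, \dots, x_k$ to the path vertices, $c (x_0 - x_k)^2 \le c \left(\sum_i c_i^{-1}\right) \sum_i c_i (x_{i-1} - x_i)^2$. The Python sketch below evaluates the difference of the two sides on random inputs. The function name, the random ranges, and the fixed seed are choices made for this example only.

```python
import random


def poincare_gap(c, path_weights, x):
    """Right side minus left side of Lemma A.14 evaluated at x.

    x[0], ..., x[k] are the values on the path vertices, with u = 0 and v = k.
    """
    k = len(path_weights)
    lhs = c * (x[0] - x[k]) ** 2
    path_energy = sum(ci * (x[i] - x[i + 1]) ** 2 for i, ci in enumerate(path_weights))
    rhs = c * sum(1 / ci for ci in path_weights) * path_energy
    return rhs - lhs


rng = random.Random(0)
worst = float("inf")
for _ in range(1000):
    weights = [rng.uniform(0.1, 5.0) for _ in range(rng.randint(1, 6))]
    x = [rng.uniform(-1.0, 1.0) for _ in range(len(weights) + 1)]
    worst = min(worst, poincare_gap(rng.uniform(0.1, 5.0), weights, x))

# The bound is tight when the drop across each edge is proportional to 1 / c_i.
tight = poincare_gap(1, [1, 2], [0, 2, 3])
```

The smallest gap observed stays non-negative up to rounding, and the hand-picked tight case gives a gap of exactly zero.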

As the weights of the edges we consider in this section are determined by the demands of
their vertices, we introduce the notation

\[ [i, j] = d_i d_j (i, j). \]

With this notation, we can express the product demand graph as

\[ \sum_{i < j} [i, j] = \frac{1}{2} \sum_{i, j \in V} [i, j]. \]


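Lemma A.15. If $|L| \le |H|$, then

\[ G_{HH} + G_{LH} \approx_{3|L|/|H|} G. \]

Proof. The lower bound $G_{HH} + G_{LH} \preceq G_{HH} + G_{LH} + G_{LL}$ follows from $G_{LL} \succeq 0$.

Using Lemma A.14 and the assumptions $d_l \le 1$ for $l \in L$ and $d_h = 1$ for $h \in H$, we
derive for every $l_1, l_2 \in L$,

\[ \begin{aligned}
[l_1, l_2] &= \frac{1}{|H|^2} \sum_{h_1, h_2 \in H} [l_1, l_2] \\
&\preceq \frac{1}{|H|^2} \sum_{h_1, h_2 \in H} d_{l_1} d_{l_2}
   \left( \frac{1}{d_{l_1} d_{h_1}} + \frac{1}{d_{h_1} d_{h_2}} + \frac{1}{d_{h_2} d_{l_2}} \right)
   \left( [l_1, h_1] + [h_1, h_2] + [h_2, l_2] \right)
   \quad \text{(by Lemma A.14)} \\
&\preceq \frac{3}{|H|^2} \sum_{h_1, h_2 \in H} \left( [l_1, h_1] + [h_1, h_2] + [h_2, l_2] \right) \\
&= \frac{3}{|H|} \sum_{h \in H} \left( [l_1, h] + [l_2, h] \right) + \frac{6}{|H|^2} G_{HH}.
\end{aligned} \]

So,

\[ \begin{aligned}
G_{LL} = \frac{1}{2} \sum_{l_1, l_2 \in L} [l_1, l_2]
&\preceq \frac{1}{2} \sum_{l_1, l_2 \in L}
   \left( \frac{3}{|H|} \sum_{h \in H} \left( [l_1, h] + [l_2, h] \right) + \frac{6}{|H|^2} G_{HH} \right) \\
&= \frac{3|L|}{|H|} G_{LH} + \frac{3|L|^2}{|H|^2} G_{HH}.
\end{aligned} \]

The assumption $|L| \le |H|$ then allows us to conclude

\[ G_{HH} + G_{LH} + G_{LL} \preceq \left( 1 + \frac{3|L|}{|H|} \right) \left( G_{HH} + G_{LH} \right). \]

□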
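
The two-sided bound in this proof can be probed numerically. The Python sketch below draws random low demands in $(0, 1]$, sets all high demands to 1, and compares the quadratic form of $G$ with that of $G_{HH} + G_{LH}$ on random vectors. The sizes $|L| = 3$ and $|H| = 9$, the seed, and the helper names are assumptions of the example, and random testing only illustrates the inequality rather than proving it.

```python
import random


def energy(d, pairs, x):
    """Quadratic form of the product demand edges listed in pairs."""
    return sum(d[i] * d[j] * (x[i] - x[j]) ** 2 for i, j in pairs)


k, n = 3, 12                                   # |L| = 3 low vertices, |H| = 9 high vertices
rng = random.Random(1)
d = [rng.uniform(0.05, 1.0) for _ in range(k)] + [1.0] * (n - k)

all_pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
kept_pairs = [(i, j) for i, j in all_pairs if j >= k]   # drops only the low-low edges
factor = 1 + 3 * k / (n - k)                   # 1 + 3|L|/|H|

ok = True
for _ in range(2000):
    x = [rng.gauss(0, 1) for _ in range(n)]
    full = energy(d, all_pairs, x)
    kept = energy(d, kept_pairs, x)
    ok = ok and kept <= full + 1e-9 and full <= factor * kept + 1e-9
```

With these sizes the factor is $1 + 3 \cdot 3/9 = 2$, and only the three low-low edges are dropped.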

Using a similar technique, we will show that the edges between L and H can be replaced by 
the union of a small number of stars. In particular, we will partition the vertices of H into k 
sets, and for each of these sets we will create one star connecting the vertices in that set to a 
corresponding vertex in L. 

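We employ the following consequence of the Poincaré inequality in Lemma A.14.

Lemma A.16. For any $\epsilon \le 1$, $l \in L$ and $h_1, h_2 \in H$,

\[ \epsilon [h_1, l] + (1/2) [h_1, h_2] \approx_{4\sqrt{\epsilon}} \epsilon [h_2, l] + (1/2) [h_1, h_2]. \]

Proof. By applying Lemma A.14 and recalling that $d_{h_1} = d_{h_2} = 1$ and $d_l \le 1$, we compute

\[ \begin{aligned}
[h_1, l] &\preceq d_{h_1} d_l \left( \frac{\sqrt{\epsilon}}{d_{h_1} d_{h_2}} + \frac{1}{d_{h_2} d_l} \right)
   \left( \frac{1}{\sqrt{\epsilon}} [h_1, h_2] + [h_2, l] \right) \\
&\preceq (1 + \sqrt{\epsilon}) \left( \frac{1}{\sqrt{\epsilon}} [h_1, h_2] + [h_2, l] \right) \\
&\preceq (1 + \sqrt{\epsilon}) [h_2, l] + \frac{2}{\sqrt{\epsilon}} [h_1, h_2].
\end{aligned} \]

Multiplying both sides by $\epsilon$ and adding $(1/2)[h_1, h_2]$ then gives

\[ \begin{aligned}
\epsilon [h_1, l] + (1/2) [h_1, h_2]
&\preceq (1 + \sqrt{\epsilon}) \epsilon [h_2, l] + (2\sqrt{\epsilon} + 1/2) [h_1, h_2] \\
&\preceq (1 + 4\sqrt{\epsilon}) \left( \epsilon [h_2, l] + (1/2) [h_1, h_2] \right) \\
&\preceq e^{4\sqrt{\epsilon}} \left( \epsilon [h_2, l] + (1/2) [h_1, h_2] \right).
\end{aligned} \]

By symmetry, we also have

\[ \epsilon [h_2, l] + (1/2) [h_1, h_2] \preceq e^{4\sqrt{\epsilon}} \left( \epsilon [h_1, l] + (1/2) [h_1, h_2] \right). \]

□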


Lemma A.17. Recall that $L = \{1, \dots, k\}$ and let $V_1, \dots, V_k$ be a partition of $H = \{k + 1, \dots, n\}$
so that $|V_l| \ge s$ for all $l$. Then,

\[ G_{HH} + G_{LH} \approx_{4/\sqrt{s}} G_{HH} + \sum_{l \in L} \frac{|H|}{|V_l|} \sum_{h \in V_l} [l, h]. \]

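Proof. Observe that

\[ G_{LH} = \sum_{l \in L} \sum_{h \in H} [l, h]. \]

For each $l \in L$, $h_1 \in H$ and $h_2 \in V_l$ we apply Lemma A.16 with $\epsilon = 1/|V_l| \le 1/s$ to show that

\[ \frac{1}{|V_l|} [l, h_1] + \frac{1}{2} [h_1, h_2] \approx_{4/\sqrt{s}} \frac{1}{|V_l|} [l, h_2] + \frac{1}{2} [h_1, h_2]. \]

Summing this approximation over all $h_2 \in V_l$ gives

\[ [l, h_1] + \sum_{h_2 \in V_l} \frac{1}{2} [h_1, h_2]
   \approx_{4/\sqrt{s}} \sum_{h_2 \in V_l} \left( \frac{1}{|V_l|} [l, h_2] + \frac{1}{2} [h_1, h_2] \right). \]

Summing the left-hand side of this approximation over all $l \in L$ and $h_1 \in H$ gives

\[ \sum_{l \in L, h_1 \in H} [l, h_1] + \sum_{l \in L, h_1 \in H} \sum_{h_2 \in V_l} \frac{1}{2} [h_1, h_2]
   = \sum_{l \in L, h_1 \in H} [l, h_1] + \frac{1}{2} \sum_{h_1 \in H} \sum_{l \in L} \sum_{h_2 \in V_l} [h_1, h_2]
   = G_{LH} + G_{HH}. \]

On the other hand, the sum of the right-hand terms gives

\[ G_{HH} + \sum_{l \in L, h_1 \in H} \sum_{h_2 \in V_l} \frac{1}{|V_l|} [l, h_2]
   = G_{HH} + \sum_{l \in L} \sum_{h_2 \in V_l} \frac{|H|}{|V_l|} [l, h_2]. \]

□

A.3 Weighted Bipartite Expanders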

This construction extends analogously to bipartite product graphs. The bipartite product
demand graph of vectors $(d^A, d^B)$ is a complete bipartite graph whose weight between
vertices $i \in A$ and $j \in B$ is given by $w_{ij} = d_i^A d_j^B$. Without loss of generality, we will assume
$d_1^A \ge d_2^A \ge \cdots \ge d_{n_A}^A$ and $d_1^B \ge d_2^B \ge \cdots \ge d_{n_B}^B$.

As the weights of the edges we consider in this section are determined by the demands of
their vertices, we introduce the notation

\[ [i, j] = d_i^A d_j^B (i, j). \]

Our construction is based on a similar observation: if most vertices on the $A$ side have $d_i^A$
equal to $d_1^A$ and most vertices on the $B$ side have $d_i^B$ equal to $d_1^B$, then the uniform demand
graph on these vertices dominates the graph.


$G'$ = WeightedBipartiteExpander($d^A$, $d^B$, $\epsilon$)

1. Let $n' = \max(n^A, n^B)$ and $\hat{n}$ be the least integer greater than $2n'/\epsilon^2$ such that the algorithm
   described in Lemma A.8 produces an $\epsilon$-approximation of $K_{\hat{n},\hat{n}}$.

2. Let $t^A = \frac{\sum_k d_k^A}{\hat{n}}$ and $t^B = \frac{\sum_k d_k^B}{\hat{n}}$.

3. Create a new bipartite demand graph $\widehat{G}$ with demands $\hat{d}^A$ and $\hat{d}^B$ as follows:

   (a) On the side $A$ of the graph, for each vertex $i$, create a subset $S_i$ consisting of
       $\lceil d_i^A/t^A \rceil$ vertices:

       i. $\lfloor d_i^A/t^A \rfloor$ with demand $t^A$.

       ii. one vertex with demand $d_i^A - t^A \lfloor d_i^A/t^A \rfloor$.

   (b) Let $H^A$ contain $\hat{n}$ vertices of $A$ with demand $t^A$, and let $L^A$ contain the rest. Set
       $k^A = |L^A|$.

   (c) Create the side $B$ of the graph with partition $H^B$, $L^B$ and demand vector $\hat{d}^B$ similarly.

4. Partition $H^A$ into sets of size $|V_l^A| \ge \lfloor \hat{n}/k^A \rfloor$, one corresponding to each vertex $l \in L^A$.
   Partition $H^B$ similarly.

5. Let $\widetilde{K}_{H^A H^B}$ be a bipartite expander produced by Lemma A.8 that $\epsilon$-approximates $K_{\hat{n},\hat{n}}$,
   identified with the vertices $H^A$ and $H^B$.

   Set

   \[ \widetilde{G} = t^A t^B \widetilde{K}_{H^A H^B}
      + \sum_{l \in L^A} \frac{|H^B|}{|V_l^B|} \sum_{h \in V_l^B} \hat{d}_l^A \hat{d}_h^B (l, h)
      + \sum_{l \in L^B} \frac{|H^A|}{|V_l^A|} \sum_{h \in V_l^A} \hat{d}_l^B \hat{d}_h^A (l, h). \]

6. Let $G'$ be the graph obtained by collapsing together all vertices in each set $S_i^A$ and $S_i^B$.

Similarly to the non-bipartite case, the Poincaré inequality shows that the edges between low
demand vertices can be completely omitted if there are many high demand vertices, which allows
the demand to be routed through high demand vertices.

Lemma A.18. Let $G$ be the bipartite product demand graph of the demand $(d^A, d^B)$. Let $H^A$
be a subset of vertices on the $A$ side with demand higher than the set of remaining vertices $L^A$ on
the $A$ side. Define $H^B$, $L^B$ similarly. Assume that $|L^A| \le |H^A|$ and $|L^B| \le |H^B|$. Then

\[ G_{H^A H^B} + G_{H^A L^B} + G_{L^A H^B}
   \approx_{3 \max\left( \frac{|L^A|}{|H^A|}, \frac{|L^B|}{|H^B|} \right)} G. \]

Proof. The proof is analogous to Lemma A.15, but with the upper bound modified for bipartite
graphs.

For every edge $(l_A, l_B)$, we embed it evenly into paths of the form $l_A, h_B, h_A, l_B$ over all choices
of $h_A$ and $h_B$. The support of this embedding can be calculated using Lemma A.14, and the
overall accounting follows in the same manner as Lemma A.15.

□


It remains to show that the edges between low demand and high demand vertices can be
compressed into a few edges. The proof here is also analogous to Lemma A.16: we use the
Poincaré inequality to show that all demands can be routed through high demand vertices. The
structure of the bipartite graph makes it helpful to further abstract these inequalities via the
following lemma for four edges.

Lemma A.19. Let $G$ be the bipartite product demand graph of the demand $(d^A, d^B)$. Let
$h_A, l_A \in A$ and $h_{B,1}, h_{B,2} \in B$. Assume that $d_{h_A}^A = d_{h_{B,1}}^B = d_{h_{B,2}}^B \ge d_{l_A}^A$. For any $\epsilon < 1$, we have

\[ \epsilon [l_A, h_{B,1}] + [h_A, h_{B,2}] + [h_A, h_{B,1}]
   \approx_{3\sqrt{\epsilon}} \epsilon [l_A, h_{B,2}] + [h_A, h_{B,2}] + [h_A, h_{B,1}]. \]

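Proof. Using Lemma A.14 and $d_{h_A}^A = d_{h_{B,1}}^B = d_{h_{B,2}}^B \ge d_{l_A}^A$, we have

\[ \begin{aligned}
[l_A, h_{B,1}]
&\preceq d_{l_A}^A d_{h_{B,1}}^B
   \left( \frac{1}{d_{l_A}^A d_{h_{B,2}}^B} + \frac{\sqrt{\epsilon}}{d_{h_A}^A d_{h_{B,2}}^B} + \frac{\sqrt{\epsilon}}{d_{h_A}^A d_{h_{B,1}}^B} \right)
   \left( [l_A, h_{B,2}] + \frac{1}{\sqrt{\epsilon}} [h_A, h_{B,2}] + \frac{1}{\sqrt{\epsilon}} [h_A, h_{B,1}] \right) \\
&\preceq (1 + 2\sqrt{\epsilon}) [l_A, h_{B,2}]
   + \frac{1 + 2\sqrt{\epsilon}}{\sqrt{\epsilon}} [h_A, h_{B,2}]
   + \frac{1 + 2\sqrt{\epsilon}}{\sqrt{\epsilon}} [h_A, h_{B,1}].
\end{aligned} \]

Therefore,

\[ \epsilon [l_A, h_{B,1}] + [h_A, h_{B,2}] + [h_A, h_{B,1}]
   \preceq (1 + 3\sqrt{\epsilon}) \left( \epsilon [l_A, h_{B,2}] + [h_A, h_{B,2}] + [h_A, h_{B,1}] \right). \]

The other side is similar due to the symmetry. □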

Theorem A.20. Let $0 < \epsilon < 1$ and let $G$ be a bipartite demand graph with $n$ vertices and
demand vector $(d^A, d^B)$. WeightedBipartiteExpander produces a graph $G'$ with $O(n/\epsilon^4)$
edges that is an $O(\epsilon)$ approximation of $G$. Moreover, WeightedBipartiteExpander runs in
$O(\log n)$ depth and $O(n/\epsilon^4)$ work.

Proof. The proof is analogous to Theorem A.13. After the splitting, the demands in $H^A$ are
higher than the demands in $L^A$, and likewise for $H^B$ and $L^B$. Therefore, Lemma A.18 shows that

\[ \widehat{G}_{H^A H^B} + \widehat{G}_{H^A L^B} + \widehat{G}_{L^A H^B} \approx_{3\epsilon^2/2} \widehat{G}. \]

By a proof analogous to Lemma A.17, one can use Lemma A.19 to show that

\[ \widehat{G}_{H^A H^B} + \widehat{G}_{H^A L^B} + \widehat{G}_{L^A H^B}
   \approx_{O(\epsilon)} \widehat{G}_{H^A H^B}
   + \sum_{l \in L^A} \frac{|H^B|}{|V_l^B|} \sum_{h \in V_l^B} \hat{d}_l^A \hat{d}_h^B (l, h)
   + \sum_{l \in L^B} \frac{|H^A|}{|V_l^A|} \sum_{h \in V_l^A} \hat{d}_l^B \hat{d}_h^A (l, h). \]

And, we already know that $t^A t^B \widetilde{K}_{H^A H^B}$ is an $\epsilon$-approximation of $\widehat{G}_{H^A H^B}$. Fact 3.3 says that we can
combine these three approximations to conclude that $\widetilde{G}$ is an $O(\epsilon)$-approximation of $\widehat{G}$. □